MySQL: select multiple columns, grouped by sorted column values

I have this table columns structure:
id - n1 - n2 - n3
And here it is with some dummy data:
id - n1 - n2 - n3
1 - 3 - 2 - 1
2 - 6 - 5 - 7
3 - 2 - 3 - 1
4 - 1 - 6 - 5
5 - 5 - 6 - 7
6 - 3 - 5 - 6
The idea is to select and count each distinct combination of n1, n2 and n3 after sorting the three values within each row, so that 3 - 2 - 1 and 2 - 3 - 1 both count as 1 - 2 - 3.
So, for example, we could get this result:
total - n1s - n2s - n3s
2 - 1 - 2 - 3
2 - 5 - 6 - 7
1 - 1 - 5 - 6
1 - 3 - 5 - 6
Can you help me write a statement to achieve that?
I am trying to do this without multiple selects and PHP array sorting.
Thanks.

Consider the following - a normalised dataset...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL
,n INT NOT NULL
,val INT NOT NULL
,PRIMARY KEY(id,n)
);
INSERT INTO my_table VALUES
(1, 1, 3),
(1, 2, 2),
(1, 3, 1),
(2, 1, 6),
(2, 2, 5),
(2, 3, 7),
(3, 1, 2),
(3, 2, 3),
(3, 3, 1),
(4, 1, 1),
(4, 2, 6),
(4, 3, 5),
(5, 1, 5),
(5, 2, 6),
(5, 3, 7),
(6, 1, 3),
(6, 2, 5),
(6, 3, 6);
Here's a quick (to write) and dirty solution. Faster / more elegant solutions are available...
SELECT vals
, COUNT(*) total
FROM
( SELECT id
, GROUP_CONCAT(val ORDER BY val) vals
FROM my_table
GROUP
BY id
) x
GROUP
BY vals;
+-------+-------+
| vals | total |
+-------+-------+
| 1,2,3 | 2 |
| 1,5,6 | 1 |
| 3,5,6 | 1 |
| 5,6,7 | 2 |
+-------+-------+

We just need expressions to "sort" the values in columns n1, n2 and n3. If we have that, then we can do a simple GROUP BY and COUNT.
SELECT COUNT(1) AS total
, IF(t.n1<=t.n2,IF(t.n1<=t.n3,t.n1,t.n3),IF(t.n2<=t.n3,t.n2,t.n3)) AS n1s
, IF(t.n1<=t.n2,IF(t.n2<=t.n3,t.n2,IF(t.n1<=t.n3,t.n3,t.n1)),IF(t.n1<=t.n3,t.n1,IF(t.n2<=t.n3,t.n3,t.n2 ))) AS n2s
, IF(t.n1<=t.n2,IF(t.n2<=t.n3,t.n3,t.n2),IF(t.n1<=t.n3,t.n3,t.n1)) AS n3s
FROM this_table_column_structure t
GROUP BY n1s,n2s,n3s
ORDER BY total DESC, n1s, n2s, n3s
will return
total n1s n2s n3s
----- ---- ---- ----
2 1 2 3
2 5 6 7
1 1 5 6
1 3 5 6
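For anyone who wants to check the expected result outside the database, here is a minimal Python sketch of the same idea: sort the values within each row, then count identical sorted rows. The rows list is the sample data from the question; the function name count_sorted_groups is just a name picked for this illustration.

```python
from collections import Counter

# Sample rows (n1, n2, n3) copied from the question.
rows = [(3, 2, 1), (6, 5, 7), (2, 3, 1), (1, 6, 5), (5, 6, 7), (3, 5, 6)]

def count_sorted_groups(rows):
    """Sort the values inside each row, then count identical sorted rows."""
    counts = Counter(tuple(sorted(row)) for row in rows)
    # Same ordering as the query: total descending, then n1s, n2s, n3s.
    return sorted(((total, *key) for key, total in counts.items()),
                  key=lambda r: (-r[0], r[1:]))

for total, n1s, n2s, n3s in count_sorted_groups(rows):
    print(total, n1s, n2s, n3s)
```

This should print the same four rows as the query result above.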

As a first approach (if time permits), you should really consider normalizing your table, as suggested in @Strawberry's answer.
However, a second approach allowing any number of columns (although inefficient due to String operations and Bubble Sorting) is possible, utilizing User Defined Functions.
We basically need to create a function, which can sort the values inside a comma separated string. I found a working function, which can do the sorting. Reproducing code from here:
-- sort comma separated substrings with unoptimized bubble sort
DROP FUNCTION IF EXISTS sortString;
DELIMITER |
CREATE FUNCTION sortString(inString TEXT) RETURNS TEXT
BEGIN
DECLARE delim CHAR(1) DEFAULT ','; -- delimiter
DECLARE strings INT DEFAULT 0; -- number of substrings
DECLARE forward INT DEFAULT 1; -- index for traverse forward thru substrings
DECLARE backward INT; -- index for traverse backward thru substrings, position in calc. substrings
DECLARE remain TEXT; -- work area for calc. no of substrings
-- swap areas TEXT for string compare, INT for numeric compare
DECLARE swap1 TEXT; -- left substring to swap
DECLARE swap2 TEXT; -- right substring to swap
SET remain = inString;
SET backward = LOCATE(delim, remain);
WHILE backward != 0 DO
SET strings = strings + 1;
SET backward = LOCATE(delim, remain);
SET remain = SUBSTRING(remain, backward+1);
END WHILE;
IF strings < 2 THEN RETURN inString; END IF;
REPEAT
SET backward = strings;
REPEAT
SET swap1 = SUBSTRING_INDEX(SUBSTRING_INDEX(inString,delim,backward-1),delim,-1);
SET swap2 = SUBSTRING_INDEX(SUBSTRING_INDEX(inString,delim,backward),delim,-1);
IF swap1 > swap2 THEN
SET inString = TRIM(BOTH delim FROM CONCAT_WS(delim
,SUBSTRING_INDEX(inString,delim,backward-2)
,swap2,swap1
,SUBSTRING_INDEX(inString,delim,(backward-strings))));
END IF;
SET backward = backward - 1;
UNTIL backward < 2 END REPEAT;
SET forward = forward +1;
UNTIL forward + 1 > strings
END REPEAT;
RETURN inString;
END |
DELIMITER ;
You will need to run this code on your MySQL server once, so that the function is available within a query, just like the native built-in MySQL functions. After that, the querying part becomes simple: use CONCAT_WS() to join all the number columns with commas, apply sortString() to the concatenated string, and use the resulting "ordered" string in the GROUP BY clause to get the desired result.
Try:
SELECT sortString(CONCAT_WS(',', n1, n2, n3)) AS n_sequence, -- add more columns inside CONCAT_WS here
COUNT(id) AS total
FROM your_table
GROUP BY n_sequence
ORDER BY total DESC;
Finally, I suggest using your application code to split the comma-separated n_sequence back into columns for display.
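Two things may be worth checking on the application side. First, sortString() as written compares swap1 and swap2 as TEXT (its own comment mentions INT for numeric compare), so I would expect multi-digit values to be ordered as strings. Second, the n_sequence string has to be split back into columns. The sketch below illustrates both in Python; sort_string and split_sequence are hypothetical helper names, not part of the answer's SQL.

```python
def sort_string(in_string, numeric=False):
    """Sort comma-separated substrings, as strings or as integers."""
    parts = in_string.split(",")
    key = int if numeric else None  # None means plain string comparison
    return ",".join(sorted(parts, key=key))

def split_sequence(n_sequence):
    """Turn '1,2,3' back into a list of integers for tabular display."""
    return [int(part) for part in n_sequence.split(",")]

print(sort_string("10,9,2"))                # string comparison
print(sort_string("10,9,2", numeric=True))  # numeric comparison
print(split_sequence("1,2,3"))
```

With string comparison, "10" sorts before "2" and "9". That does not matter for single-digit sample data, but it could matter for larger numbers.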

Related

Order by for column in varchar type

I have the following column, Strand, which is sorted in ascending order, but it puts 3.10 right after 3.1 instead of 3.2.
The column is of type varchar.
Strand
3.1
3.1.1
3.1.1.1
3.1.1.2
3.1.2
3.1.2.1
3.10 # wrong
3.10.1 # wrong
3.10.1.1 # wrong
3.2 <- this should have been after 3.1.2.1
3.2.1
3.2.1.1
..
3.9
3.9.1.1
<- here is where 3.10 , 3.10.1 and 3.10.1.1 should reside
I used the following query to order it:
SELECT * FROM [table1]
ORDER BY RPAD(Strand,4,'.0') ;
How can I make sure it is ordered correctly, so that 3.10, 3.10.1 and 3.10.1.1 come last?
Try this:
DROP TABLE IF EXISTS T1;
CREATE TABLE T1 (Strand VARCHAR(20));
INSERT INTO T1 VALUES ('3.1');
INSERT INTO T1 VALUES('3.1.1');
INSERT INTO T1 VALUES('3.1.1.1');
INSERT INTO T1 VALUES('3.1.1.2');
INSERT INTO T1 VALUES('3.2');
INSERT INTO T1 VALUES('3.2.1');
INSERT INTO T1 VALUES('3.10');
INSERT INTO T1 VALUES('3.10.1');
SELECT * FROM T1
ORDER BY STRAND;
SELECT *
FROM T1
ORDER BY
CAST(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',1) AS UNSIGNED INTEGER) *1000 +
CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',2),'.',-1) AS UNSIGNED INTEGER) *100 +
CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',3),'.',-1) AS UNSIGNED INTEGER) *10 +
CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',4),'.',-1) AS UNSIGNED INTEGER);
Output, not ordered:
Strand
1 3.1
2 3.1.1
3 3.1.1.1
4 3.1.1.2
5 3.10
6 3.10.1
7 3.2
8 3.2.1
Output Ordered:
Strand
1 3.1
2 3.1.1
3 3.1.1.1
4 3.1.1.2
5 3.2
6 3.2.1
7 3.10
8 3.10.1
You can order the result based on the integer value of your field. Your code will look like this:
select [myfield] from [mytable] order by
convert(RPAD(replace([myfield],'.',''),4,0),UNSIGNED INTEGER);
In this code, the replace function strips the dots (.).
Hope this helps.
You must normalize each group of digits
SELECT * FROM [table1]
ORDER BY CONCAT(
LPAD(SUBSTRING_INDEX(Strand,'.',1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',2),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',3),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',4),'.',-1),3,'0'));
sample
mysql> SELECT CONCAT(
-> LPAD(SUBSTRING_INDEX('3.10.1.1','.',1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',2),'.',-1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',3),'.',-1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',4),'.',-1),3,'0'));
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT(
LPAD(SUBSTRING_INDEX('3.10.1.1','.',1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',2),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',3),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRI |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 003-010-001-001 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)
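To see why padding each group of digits gives the right order, here is a small Python sketch of the same key. padded_key is a hypothetical name, and one detail differs from the SQL above: missing levels are filled with zeros here, which is my own assumption about how short values such as 3.1 should be handled.

```python
def padded_key(strand, levels=4, width=3):
    """Build a key like '003-010-001-001' from a dotted value like '3.10.1.1'."""
    parts = strand.split(".")
    parts += ["0"] * (levels - len(parts))  # assumption: treat missing levels as 0
    return "-".join(p.zfill(width) for p in parts[:levels])

strands = ["3.1", "3.1.1", "3.10", "3.10.1", "3.2", "3.2.1"]
for s in sorted(strands, key=padded_key):
    print(s, padded_key(s))
```

Because every group has the same width, plain string comparison of the keys agrees with numeric comparison of the groups.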
Because the "strand" column holds text data, it is ordered alphabetically. To get the order you want, you should format your data before inserting or updating it. Supposing each level has at most 3 digits, your data should be formatted like this:
003.001
003.001.001
003.001.001.001
003.002
003.002.001
003.002.001.001
003.010
010.001
The alternative is to split the "strand" column into multiple columns, one per level, such as:
Level1 | Level2 | Level3 ...
3 | 1 | 0
3 | 1 | 1
3 | 2 | 0
...
3 | 10 | 0
These columns should have a numeric data type, so that you can simply order by them.
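The per-level idea can be sketched in Python as a tuple of integers, one per level. This is only an illustration of the ordering; strand_key is a hypothetical name and nothing here touches the database.

```python
def strand_key(strand):
    """Split '3.10.1' into (3, 10, 1) so each level compares as a number."""
    return tuple(int(part) for part in strand.split("."))

strands = ["3.10", "3.2", "3.1.2.1", "3.1", "3.9.1.1", "3.10.1"]
print(sorted(strands, key=strand_key))
```

Tuples compare level by level, and a shorter tuple sorts before a longer one with the same prefix, which matches the order asked for in the question.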
If your values contain no more than 3 dots (.), you can try this:
select *
from demo
order by replace(Strand, '.', '') * pow(10, (3 + length(replace(Strand, '.', '')) - length(Strand)))
If the maximum number of dots is not known in advance, you can use a subquery to get it:
select demo.Strand
from demo
cross join (
select max(length(Strand) - length(replace(Strand, '.', ''))) as num from demo
) t
order by replace(Strand, '.', '') * pow(10, (num + length(replace(Strand, '.', '')) - length(Strand)))
See demo in Rextester.
As you can see, I've used the functions replace, length and pow in the ORDER BY clause.
1) replace(Strand, '.', '') strips the dots and leaves a string of digits that is used as a number, like:
replace('3.10.1.1', '.', '') => 31011;
2) (3 + length(replace(Strand, '.', '')) - length(Strand)) gives the maximum number of dots minus the number of dots in Strand, like:
3.1 => 2;
3) pow returns the value of X raised to the power of Y.
So the sample data will be calculated as:
3100
3110
3111
3112
3120
3121
31000
31010
31011
3200
3210
3211
3900
3911
Ordering by these numbers gives the right sort.
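The calculation can be reproduced in Python as a quick check. pow_key is a hypothetical name, and max_dots=3 mirrors the constant 3 in the first query. The last comparison is my own observation rather than something from the answer: a value like 3.1.10 ends up after 3.10, so the trick seems safe only when the deeper levels stay single-digit.

```python
def pow_key(strand, max_dots=3):
    """Digits without dots, scaled by 10 for every dot that is missing."""
    digits = strand.replace(".", "")
    dots = len(strand) - len(digits)
    return int(digits) * 10 ** (max_dots - dots)

for s in ["3.1", "3.1.1", "3.10", "3.10.1.1", "3.2", "3.9.1.1"]:
    print(s, pow_key(s))

# Possible pitfall: 3.1.10 belongs under 3.1 but gets a larger key than 3.10.
print(pow_key("3.1.10") > pow_key("3.10"))
```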

How to import a text file with no delimiters with 2 lines representing a case

I have several text files that I need to import into MySQL, but they don't have any delimiters, and 3 lines in the text file represent one record.
When I try to import a file, everything goes into one column. Please see an example below:
00003461020000001ACH1 00000000 00000000000 00000000 000000005011025708084 0 00 00 000000000000000000000 00000000241523551MA00
You need a helper table first.
CREATE TABLE tmpHelperTable(
your_data varchar(255),
a int,
b int
);
Then you need two user-defined variables while loading your data.
SET @va = 0;
SET @vb = 0;
LOAD DATA INFILE 'your_data_file.csv'
INTO TABLE tmpHelperTable
LINES TERMINATED BY '\n'
(your_data)
SET a = @va := IF(@va = 3, 1, @va + 1),
b = IF(@va = 1, @vb := @vb + 1, @vb);
This line
SET a = @va := IF(@va = 3, 1, @va + 1),
is just an incrementing value that resets to 1 after it reaches 3 (or however many lines make up one case).
The line
b = IF(@va = 1, @vb := @vb + 1, @vb);
just increments its value every time the previous variable got reset. We need this so we can group by it. Then you have a table like this:
your_data | a | b
xxxxxx 1 1
yyyyyy 2 1
zzzzzz 3 1
aaaaaa 1 2
bbbbbb 2 2
cccccc 3 2
dddddd 1 3
...
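If the two counters are hard to follow, this Python sketch mimics them on a list of lines. It only illustrates the numbering logic; number_lines and lines_per_record are names chosen here, with va and vb playing the role of the two user variables.

```python
def number_lines(lines, lines_per_record=3):
    """Return (line, a, b): a cycles 1..lines_per_record, b counts records."""
    numbered = []
    va = vb = 0
    for line in lines:
        # a: increment, wrapping back to 1 after the last line of a record
        va = 1 if va == lines_per_record else va + 1
        # b: a new record starts whenever a has just been reset to 1
        if va == 1:
            vb += 1
        numbered.append((line, va, vb))
    return numbered

for row in number_lines(["xxxxxx", "yyyyyy", "zzzzzz", "aaaaaa", "bbbbbb"]):
    print(row)
```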
Then all you have to do is to pivot the table into your final table.
CREATE TABLE final_table(
id int,
data_1 varchar(255),
data_2 varchar(255),
data_3 varchar(255)
);
INSERT INTO final_table
SELECT
b,
MAX(IF(a = 1, your_data, NULL)),
MAX(IF(a = 2, your_data, NULL)),
MAX(IF(a = 3, your_data, NULL))
FROM
tmpHelperTable
GROUP BY b;
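As an alternative to doing the pivot in SQL, the same reshaping could be done before the import. The Python sketch below groups every three lines into one record; pivot_records is a hypothetical helper, and filling an incomplete final record with None is my assumption, chosen to match the NULLs the query would produce.

```python
def pivot_records(lines, lines_per_record=3):
    """Group consecutive lines into (id, data_1, data_2, data_3) tuples."""
    records = []
    for start in range(0, len(lines), lines_per_record):
        chunk = lines[start:start + lines_per_record]
        # Pad a short final chunk so every record has the same width.
        chunk += [None] * (lines_per_record - len(chunk))
        records.append((start // lines_per_record + 1, *chunk))
    return records

lines = ["xxxxxx", "yyyyyy", "zzzzzz", "aaaaaa", "bbbbbb", "cccccc", "dddddd"]
for record in pivot_records(lines):
    print(record)
```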

Sort values that contain letters and symbols in a custom order

Can you change the MySQL sort by function? I am trying to sort my values according to an arbitrary order.
Currently looking for ways to inject a function that might help me out here short of adding a column and modifying the import.
This is the order I want:
AAA
AA+
AA
AA-
A+
A
A-
BBB+
BBB
BBB-
BB+
BB
BB-
B+
B
B-
CCC+
CCC
CCC-
CC
This is my result using sort by:
A
A+
A-
AA
AA+
AA-
AAA
B
B+
B-
BB
BB+
BB-
BBB
BBB+
BBB-
C
CC
CCC
CCC+
CCC-
EDIT:
Attempting but getting syntax errors:
CREATE FUNCTION sortRating (s CHAR(20))
RETURNS INT(2)
DECLARE var INT
CASE s
WHEN 'AAA' THEN SET var = 1
WHEN 'AA+' THEN SET var = 2
ELSE
SET VAR = 3
END CASE
RETURN var
END;
This is possible using the following syntax:
ORDER BY FIELD(<field_name>, comma-separated-custom-order)
for instance, if the expression you want to order by is called rating, then your ORDER BY clause would read:
ORDER BY FIELD(rating, 'AAA', 'AA+', 'AA', 'AA-', 'A+', 'A', 'A-',
'BBB+', 'BBB', 'BBB-', 'BB+', 'BB', 'BB-',
'B+', 'B', 'B-', 'CCC+', 'CCC', 'CCC-', 'CC')
Here's the documentation on the FIELD function. Note that FIELD returns 0 for a value that is not in the list, so such rows sort first.
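To illustrate what FIELD does, here is a rough Python equivalent: it returns the 1-based position of the value in the list, or 0 when the value is missing. The field function below is only a stand-in written for this example.

```python
CUSTOM_ORDER = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
                "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
                "B+", "B", "B-", "CCC+", "CCC", "CCC-", "CC"]

def field(value, *choices):
    """1-based index of value in choices, or 0 if it is not listed."""
    try:
        return choices.index(value) + 1
    except ValueError:
        return 0

ratings = ["B", "AA+", "AAA", "A-"]
print(sorted(ratings, key=lambda r: field(r, *CUSTOM_ORDER)))
```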
I see a pattern here:
BBB+
BBB
BBB-
BB+
BB
BB-
B+
B
B-
Think of each character as a column and sort each column in this order:
Letters
+
empty string
-
SELECT rating
FROM test
ORDER BY
MID(rating, 1, 1),
CASE MID(rating, 2, 1) WHEN '+' THEN 2 WHEN '' THEN 3 WHEN '-' THEN 4 ELSE 1 END,
CASE MID(rating, 3, 1) WHEN '+' THEN 2 WHEN '' THEN 3 WHEN '-' THEN 4 ELSE 1 END,
CASE MID(rating, 4, 1) WHEN '+' THEN 2 WHEN '' THEN 3 WHEN '-' THEN 4 ELSE 1 END
SQL Fiddle
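The column-by-column ranking can be checked with a short Python sketch. rank and rating_key are hypothetical names; slicing past the end of a Python string gives an empty string, which plays the role of MID returning '' in the query.

```python
DESIRED = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
           "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
           "B+", "B", "B-", "CCC+", "CCC", "CCC-", "CC"]

def rank(ch):
    """Letters first, then '+', then empty, then '-'."""
    return {"+": 2, "": 3, "-": 4}.get(ch, 1)

def rating_key(rating):
    """First character as-is, then the rank of characters 2, 3 and 4."""
    return (rating[0:1],) + tuple(rank(rating[i:i + 1]) for i in range(1, 4))

print(sorted(reversed(DESIRED), key=rating_key))
```

Sorting the reversed list with this key should give back the order requested in the question.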

PostgreSQL function with a loop

I'm not good at postgres functions. Could you help me out?
Say, I have this db:
name | round |position | val
-----------------------------------
A | 1 | 1 | 0.5
A | 1 | 2 | 3.4
A | 1 | 3 | 2.2
A | 1 | 4 | 3.8
A | 2 | 1 | 0.5
A | 2 | 2 | 32.3
A | 2 | 3 | 2.21
A | 2 | 4 | 0.8
I want to write a Postgres function that can loop from position=1 to position=4 and calculate the corresponding value. I could do this in python with psycopg2:
import psycopg2
import psycopg2.extras

conn = psycopg2.connect("host='localhost' dbname='mydb' user='user' password='pass'")
CURSOR = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
cmd = """SELECT name, round, position, val from mytable"""
CURSOR.execute(cmd)
rows = CURSOR.fetchall()
dict = {}  # running product per round
for row in rows:
    indx = row['round']
    try:
        dict[indx] *= (1-row['val']/100)
    except KeyError:
        # first row of this round
        dict[indx] = (1-row['val']/100)
    if row['position'] == 4:
        if indx == 1:
            result1 = dict[indx]
        elif indx == 2:
            result2 = dict[indx]
print(result1, result2)
How can I do the same thing directly in Postgres so that it returns a table of (name, result1, result2)
UPDATE:
@a_horse_with_no_name, the expected value would be:
result1 = (1 - 0.5/100) * (1 - 3.4/100) * (1 - 2.2/100) * (1 - 3.8/100) = 0.9043
result2 = (1 - 0.5/100) * (1 - 32.3/100) * (1 - 2.21/100) * (1 - 0.8/100) = 0.6535
@Glenn gave you a very elegant solution with an aggregate function. But to answer your question, a plpgsql function could look like this:
Test setup:
CREATE TEMP TABLE mytable (
name text
, round int
, position int
, val double precision
);
INSERT INTO mytable VALUES
('A', 1, 1, 0.5)
, ('A', 1, 2, 3.4)
, ('A', 1, 3, 2.2)
, ('A', 1, 4, 3.8)
, ('A', 2, 1, 0.5)
, ('A', 2, 2, 32.3)
, ('A', 2, 3, 2.21)
, ('A', 2, 4, 0.8)
;
Generic function
CREATE OR REPLACE FUNCTION f_grp_prod()
RETURNS TABLE (name text
, round int
, result double precision)
LANGUAGE plpgsql STABLE AS
$func$
DECLARE
r mytable%ROWTYPE;
BEGIN
-- init vars
name := 'A'; -- we happen to know initial value
round := 1; -- we happen to know initial value
result := 1;
FOR r IN
SELECT *
FROM mytable m
ORDER BY m.name, m.round
LOOP
IF (r.name, r.round) <> (name, round) THEN -- return result before round
RETURN NEXT;
name := r.name;
round := r.round;
result := 1;
END IF;
result := result * (1 - r.val/100);
END LOOP;
RETURN NEXT; -- return final result
END
$func$;
Call:
SELECT * FROM f_grp_prod();
Result:
name | round | result
-----+-------+---------------
A | 1 | 0.90430333812
A | 2 | 0.653458283632
Specific function as per question
CREATE OR REPLACE FUNCTION f_grp_prod(text)
RETURNS TABLE (name text
, result1 double precision
, result2 double precision)
LANGUAGE plpgsql STABLE AS
$func$
DECLARE
r mytable%ROWTYPE;
_round integer;
BEGIN
-- init vars
name := $1;
result2 := 1; -- abuse result2 as temp var for convenience
FOR r IN
SELECT *
FROM mytable m
WHERE m.name = $1 -- use the parameter; a bare "name" is ambiguous with the column
ORDER BY m.round
LOOP
IF r.round <> _round THEN -- save result1 before 2nd round
result1 := result2;
result2 := 1;
END IF;
result2 := result2 * (1 - r.val/100);
_round := r.round;
END LOOP;
RETURN NEXT;
END
$func$;
Call:
SELECT * FROM f_grp_prod('A');
Result:
name | result1 | result2
-----+---------------+---------------
A | 0.90430333812 | 0.653458283632
I guess you are looking for an aggregate "product" function. You can create your own aggregate functions in PostgreSQL and Oracle.
CREATE TABLE mytable(name varchar(32), round int, position int, val decimal);
INSERT INTO mytable VALUES('A', 1, 1, 0.5);
INSERT INTO mytable VALUES('A', 1, 2, 3.4);
INSERT INTO mytable VALUES('A', 1, 3, 2.2);
INSERT INTO mytable VALUES('A', 1, 4, 3.8);
INSERT INTO mytable VALUES('A', 2, 1, 0.5);
INSERT INTO mytable VALUES('A', 2, 2, 32.3);
INSERT INTO mytable VALUES('A', 2, 3, 2.21);
INSERT INTO mytable VALUES('A', 2, 4, 0.8);
CREATE AGGREGATE product(double precision) (SFUNC=float8mul, STYPE=double precision, INITCOND=1);
SELECT name, round, product(1-val/100) AS result
FROM mytable
GROUP BY name, round;
name | round | result
------+-------+----------------
A | 2 | 0.653458283632
A | 1 | 0.90430333812
(2 rows)
See "User-Defined Aggregates" in the PostgreSQL documentation. I borrowed the example above from here. There are other Stack Overflow answers that show other ways to do this.
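To double-check the numbers the aggregate returns, the same grouped product can be computed in plain Python. This is only a verification sketch on the sample rows from the question; group_products is a name chosen for the example, and math.prod needs Python 3.8 or later.

```python
from itertools import groupby
from math import prod

# (name, round, position, val) rows from the question.
rows = [("A", 1, 1, 0.5), ("A", 1, 2, 3.4), ("A", 1, 3, 2.2), ("A", 1, 4, 3.8),
        ("A", 2, 1, 0.5), ("A", 2, 2, 32.3), ("A", 2, 3, 2.21), ("A", 2, 4, 0.8)]

def group_products(rows):
    """Product of (1 - val/100) for each (name, round) group."""
    keyfunc = lambda r: (r[0], r[1])
    result = {}
    # groupby needs its input sorted by the grouping key.
    for key, group in groupby(sorted(rows, key=keyfunc), key=keyfunc):
        result[key] = prod(1 - r[3] / 100 for r in group)
    return result

for (name, rnd), value in group_products(rows).items():
    print(name, rnd, round(value, 6))
```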

How to get an ordered list of rows within 0 value with sql?

For example:
SELECT * FROM atable ORDER BY num;
'atable' is:
num name
1 a
3 y
0 cc
2 fs
The result is:
num name
1 a
2 fs
3 y
0 cc
But I want it to be:
num name
0 cc
1 a
2 fs
3 y
I can't reproduce the result you are seeing. The query that you posted should work as you wish it to. Here are my steps to reproduce:
CREATE TABLE atable (num INT NOT NULL, name NVARCHAR(100) NOT NULL);
INSERT INTO atable (num, name) VALUES
(1, 'a'),
(3, 'y'),
(0, 'cc'),
(2, 'fs');
SELECT * FROM atable ORDER BY num;
Result:
0, 'cc'
1, 'a'
2, 'fs'
3, 'y'
Perhaps you could post your create scripts for your table and test data in your question so that we can reproduce your result?
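For a quick check without a database server, the same steps can be replayed with Python's built-in sqlite3 module. This is a sketch using an in-memory SQLite database, so it is not the asker's DBMS, and the column types are adapted to SQLite (INTEGER and TEXT are my substitutions).

```python
import sqlite3

def ordered_rows():
    """Create the sample table in memory and return it ordered by num."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE atable (num INTEGER NOT NULL, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO atable (num, name) VALUES (?, ?)",
                     [(1, "a"), (3, "y"), (0, "cc"), (2, "fs")])
    rows = conn.execute("SELECT num, name FROM atable ORDER BY num").fetchall()
    conn.close()
    return rows

for num, name in ordered_rows():
    print(num, name)
```

With num stored as an integer, the row with 0 comes first, as in the result shown above.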
Are you sure that the 0 isn't a null value being displayed as a 0? Nulls can sort either at the top or the bottom, depending on database setting.
SELECT * FROM atable
ORDER BY ISNULL(CAST(num as int), 0);