Here's an example dataset that I'm dealing with:
+----+-----+-----+-----+-----+
| id | a | b | c | d |
+----+-----+-----+-----+-----+
| 1 | 1 | | | |
| 2 | | 2 | | |
| 3 | | | | |
| 4 | | | | 4 |
| 5 | | 3 | | |
+----+-----+-----+-----+-----+
I want to select the bottom-most values. If this value has never been set, then I'd want "null", otherwise, I want the bottom-most result. In this case, I'd want the resultset:
+-----+-----+-----+-----+
| a | b | c | d |
+-----+-----+-----+-----+
| 1 | 3 | | 4 |
+-----+-----+-----+-----+
I tried queries such as variations of:
SELECT DISTINCT `a`,`b`,`c`,`d`
FROM `test`
WHERE `a` IS NOT NULL
AND `b` IS NOT NULL
AND `c` IS NOT NULL
AND `d` IS NOT NULL
ORDER BY 'id' DESC LIMIT 1;
This didn't work.
Would I have to run queries for each value individually, or is there a way to do it within just that one query?
If you are OK with changing type to a char, you can do this:
SELECT substring_index(GROUP_CONCAT(a ORDER BY id DESC),',',1) as LastA,
substring_index(GROUP_CONCAT(b ORDER BY id DESC),',',1) as LastB,
substring_index(GROUP_CONCAT(c ORDER BY id DESC),',',1) as LastC,
substring_index(GROUP_CONCAT(d ORDER BY id DESC),',',1) as LastD
FROM MyTable;
Notes:
GROUP_CONCAT skips NULLs, and its own ORDER BY clause puts the bottom-most non-NULL value first. Ordering the rows in a derived table instead is not dependable, because the server is not guaranteed to preserve a subquery's row order.
After compressing the rows with GROUP_CONCAT (using the default comma delimiter), we then scrape out the first element with substring_index. substring_index on NULL returns NULL, as required.
If you need the resultant columns to be INT, you'll need to cast each column again.
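If you would rather keep the original column types, another option (not the GROUP_CONCAT approach above) is one scalar subquery per column, each picking the latest non-NULL value. Here is a minimal sketch, run against an in-memory SQLite database from Python only so that it is self-contained. The table name test comes from the question; the helper name last_non_null_values is made up for illustration.

```python
import sqlite3

COLUMNS = ("a", "b", "c", "d")


def last_non_null_values(conn):
    """Return one row holding the bottom-most non-NULL value of each column."""
    # One scalar subquery per column; a column that was never set yields NULL.
    select_list = ", ".join(
        f"(SELECT {col} FROM test WHERE {col} IS NOT NULL "
        f"ORDER BY id DESC LIMIT 1) AS {col}"
        for col in COLUMNS
    )
    return conn.execute(f"SELECT {select_list}").fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, a INT, b INT, c INT, d INT)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO test (id, a, b, c, d) VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, None, None, None),
        (2, None, 2, None, None),
        (3, None, None, None, None),
        (4, None, None, None, 4),
        (5, None, 3, None, None),
    ],
)
result = last_non_null_values(conn)
print(result)
```

The generated SELECT uses only standard subquery syntax, so it should carry over to MySQL unchanged, but I have only run it on SQLite here.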
I've read MySQL - UPDATE query based on SELECT Query and am trying to do something similar - i.e. run an UPDATE query on a table and populate it with the results from a SELECT.
In my case the table I want to update is called substances and has a column called cas_html which is supposed to store CAS Numbers (chemical codes) as a HTML string.
Due to the structure of the database I am running the following query which will give me a result set of the substance ID and name (substances.id, substances.name) and the CAS as a HTML string (cas_values which comes from cas.value):
SELECT s.`id`,
GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values,
GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name
FROM substances s
LEFT JOIN cas_substances cs ON s.id = cs.substance_id
LEFT JOIN cas c ON cs.cas_id = c.id
GROUP BY s.id;
Sample output:
id | cas_values                  | substance_name
---|-----------------------------|---------------
1  | 133-24<br>455-213<br>21-234 | Chemical A
2  | 999-23                      | Chemical B
3  |                             | Chemical C
As you can see the cas_values column contains the HTML string (which may also be an empty string as in the case of "Chemical C"). I want to write the data in the cas_values column into substances.cas_html. However I can't piece together how to do this because other posts I'm reading get the data for the UPDATE in one column - I have other columns returned by my SELECT query.
Essentially the problem is that in my "sample output" table above I have 3 columns being returned. Other SO posts seem to have just 1 column being returned which is the actual values that are used in the UPDATE query (in this case on the substances table).
Is this possible?
I am using MySQL 5.5.56-MariaDB
These are the structures of the tables, if this helps:
mysql> DESCRIBE substances;
+-------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| app_id | varchar(8) | NO | UNI | NULL | |
| name | varchar(1500) | NO | | NULL | |
| date | date | NO | | NULL | |
| cas_html | text | YES | | NULL | |
+-------------+-----------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> DESCRIBE cas;
+-------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| value | varchar(13) | NO | UNI | NULL | |
+-------+-----------------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
mysql> DESCRIBE cas_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| cas_id | mediumint(8) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.02 sec)
Try something like this :
UPDATE substances AS s,
(
SELECT s.`id`,
GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') cas_values,
GROUP_CONCAT(s.`name` ORDER BY s.`id`) substance_name
FROM substances s
LEFT JOIN cas_substances cs ON s.id = cs.substance_id
LEFT JOIN cas c ON cs.cas_id = c.id
GROUP BY s.id
) AS t
SET s.cas_html=t.cas_values
WHERE s.id = t.id
If you don't want to modify every row at once, the easiest way to test the update on a limited set is to add a condition to the WHERE clause, something like this:
...
WHERE s.id = t.id AND s.id = 1
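For comparison, the same update can also be written with a correlated subquery instead of a joined derived table. This is a different technique from the one above, sketched here against an in-memory SQLite database from Python so that it runs on its own. The table layouts are cut down from the question and the rows mirror its sample output. SQLite's group_concat(X, Y) does not guarantee element order, so in MySQL you would keep GROUP_CONCAT(c.`value` ORDER BY c.`id` SEPARATOR '<br>') inside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE substances (id INTEGER PRIMARY KEY, name TEXT NOT NULL, cas_html TEXT);
    CREATE TABLE cas (id INTEGER PRIMARY KEY, value TEXT NOT NULL UNIQUE);
    CREATE TABLE cas_substances (
        id INTEGER PRIMARY KEY,
        cas_id INTEGER NOT NULL,
        substance_id INTEGER NOT NULL
    );
    INSERT INTO substances (id, name)
        VALUES (1, 'Chemical A'), (2, 'Chemical B'), (3, 'Chemical C');
    INSERT INTO cas (id, value)
        VALUES (1, '133-24'), (2, '455-213'), (3, '21-234'), (4, '999-23');
    INSERT INTO cas_substances (cas_id, substance_id)
        VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

# For every substance, concatenate its CAS values and store the string.
# A substance with no CAS rows gets NULL, because the aggregate sees no rows.
conn.execute("""
    UPDATE substances
    SET cas_html = (
        SELECT group_concat(c.value, '<br>')
        FROM cas_substances cs
        JOIN cas c ON c.id = cs.cas_id
        WHERE cs.substance_id = substances.id
    )
""")

rows = conn.execute("SELECT id, cas_html FROM substances ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Only cas_html is written; the extra columns that the original SELECT returns (id, substance_name) are simply not needed by the SET clause.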
I'm working on how to implement a leaderboard. What I'd like to do is be able to sort the table by several different criteria (score, number of submissions, average). The table might look like this.
+--------+-----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-----------------------+------+-----+---------+-------+
| userID | mediumint(8) unsigned | NO | PRI | 0 | |
| score | int | YES | MUL | NULL | |
| numSub | int | YES | MUL | NULL | |
+--------+-----------------------+------+-----+---------+-------+
And a sample set of data like so:
+--------+----------+--------+
| userID | score | numSub |
+--------+----------+--------+
| 505610 | 1245 | 2 |
| 544222 | 1458 | 2 |
| 547278 | 245 | 1 |
| 659241 | 12487 | 8 |
| 681087 | 5487 | 3 |
+--------+----------+--------+
My queries will be coming from PHP.
// get the top 100 scores
$q = "select userID, score from table order by score desc limit 0, 100";
This returns a set of userID/score pairs sorted with the highest score first.
I also have a query that sorts by numSub (number of submissions).
What I would like is to sort the table by the average score, that is, score/numSub. The table could be large, so efficiency is important to me.
Thanks in advance!
If efficiency is important, then add a column avgscore and assign it the value of score/numsub. Then, create an index on the column.
You can use an insert/update trigger to do the average calculation automatically when a row is added or modified.
Once your table gets large, sorting on a computed expression is going to take a noticeable amount of time, because that sort cannot use an index.
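A rough sketch of the stored-average idea, written against an in-memory SQLite database from Python so it can be run as-is. The table name leaderboard, the index name, and the trigger names are all made up, and SQLite's trigger syntax is not MySQL's: in MySQL you would more likely use BEFORE INSERT / BEFORE UPDATE triggers that set NEW.avgscore. The NULLIF guard against numSub = 0 is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leaderboard (
        userID   INTEGER PRIMARY KEY,
        score    INTEGER,
        numSub   INTEGER,
        avgscore REAL            -- stored average, maintained by the triggers below
    );
    CREATE INDEX idx_leaderboard_avgscore ON leaderboard (avgscore);

    -- Fill in the average whenever a row is added.
    CREATE TRIGGER leaderboard_avg_insert AFTER INSERT ON leaderboard
    BEGIN
        UPDATE leaderboard
        SET avgscore = CAST(NEW.score AS REAL) / NULLIF(NEW.numSub, 0)
        WHERE userID = NEW.userID;
    END;

    -- Recompute the average whenever score or numSub changes.
    CREATE TRIGGER leaderboard_avg_update AFTER UPDATE OF score, numSub ON leaderboard
    BEGIN
        UPDATE leaderboard
        SET avgscore = CAST(NEW.score AS REAL) / NULLIF(NEW.numSub, 0)
        WHERE userID = NEW.userID;
    END;
""")

# Sample rows from the question.
conn.executemany(
    "INSERT INTO leaderboard (userID, score, numSub) VALUES (?, ?, ?)",
    [(505610, 1245, 2), (544222, 1458, 2), (547278, 245, 1),
     (659241, 12487, 8), (681087, 5487, 3)],
)
# A later submission changes one user's totals; the update trigger refreshes avgscore.
conn.execute("UPDATE leaderboard SET score = 2000, numSub = 2 WHERE userID = 547278")

top = conn.execute(
    "SELECT userID, avgscore FROM leaderboard ORDER BY avgscore DESC LIMIT 100"
).fetchall()
for row in top:
    print(row)
```

The leaderboard query now sorts on a plain indexed column rather than on score/numSub computed per row.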
As far as I can see, there's no reason to make it more complicated than this;
SELECT userID, score/numsub AS average_score
FROM Table1
ORDER BY score/numsub DESC;
I'm new to pivoting, so I came here to get some advice on this. I have a table with fields benchmarkname and value. However, another table is populated differently and out of my control: it has each benchmarkname as its own field in the table, with the row value being the value. The layout is below:
Table 1
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| stream | double | YES | | NULL | |
| pisec | double | YES | | NULL | |
| iozws | double | YES | | NULL | |
| iozwb | double | YES | | NULL | |
| iozrs | double | YES | | NULL | |
| iozrb | double | YES | | NULL | |
Table 2
| BenchmarkName | varbinary(43) | YES | | NULL | |
| Value | decimal(14,0) | YES | | NULL | |
My question is: how do I convert the first table to look like second dynamically? I believe the answer lies in a pivot, but I am unsure.
I think you want to unpivot the first table. UNPIVOTing takes the data from your columns and converts it into rows. MySQL does not have unpivot so you will have to use a UNION ALL query:
select 'stream' BenchmarkName, stream value
from table1
union all
select 'pisec' BenchmarkName, pisec value
from table1
union all
select 'iozws' BenchmarkName, iozws value
from table1
union all
select 'iozwb' BenchmarkName, iozwb value
from table1
union all
select 'iozrs' BenchmarkName, iozrs value
from table1
union all
select 'iozrb' BenchmarkName, iozrb value
from table1
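Since the question asks for something dynamic, one option is to generate the UNION ALL text from the column list instead of typing it by hand. This is only a sketch: the helper build_unpivot_sql is hypothetical, the sample values are invented, and it runs against an in-memory SQLite database from Python. Here the column names come from SQLite's PRAGMA table_info; in MySQL the list would presumably come from information_schema.COLUMNS instead.

```python
import sqlite3


def build_unpivot_sql(table, columns):
    """Build one SELECT per column and glue them together with UNION ALL."""
    # Identifiers cannot be bound as query parameters, so pass trusted names only.
    parts = [
        f"SELECT '{col}' AS BenchmarkName, {col} AS Value FROM {table}"
        for col in columns
    ]
    return "\nUNION ALL\n".join(parts)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 "
    "(stream REAL, pisec REAL, iozws REAL, iozwb REAL, iozrs REAL, iozrb REAL)"
)
# Invented sample values, one row.
conn.execute("INSERT INTO table1 VALUES (1.5, 2.5, 3.0, 4.0, 5.0, 6.0)")

# Read the column names from the table itself (second field of each PRAGMA row),
# so newly added benchmark columns are picked up without editing the query.
columns = [row[1] for row in conn.execute("PRAGMA table_info(table1)")]
sql = build_unpivot_sql("table1", columns)
rows = conn.execute(sql).fetchall()

print(sql)
for row in rows:
    print(row)
```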
Is MySQL giving me grief because the nested SELECT in the insert statement uses the COUNT(*) function instead of selecting an actual column? So, what's the workaround?
Here's the story:
mysql> explain test;
+----------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+-------+
| language | varchar(50) | YES | | NULL | |
| count | smallint(5) unsigned | YES | | NULL | |
+----------+----------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
mysql> SELECT languages.name, COUNT(*) AS `total` FROM languages JOIN events ON languages.id = events.language_id GROUP BY name HAVING total > 250 ORDER BY total DESC;
+-----------+-------+
| name | total |
+-----------+-------+
| Spanish | 60079 |
| Foochow | 2838 |
| Mandarin | 2396 |
| Russian | 1675 |
| Arabic | 1410 |
| Cantonese | 1358 |
| Korean | 736 |
| French | 531 |
| Punjabi | 426 |
| Urdu | 408 |
| Hebrew | 276 |
| Pashto | 255 |
+-----------+-------+
12 rows in set (0.00 sec)
mysql> INSERT INTO test (`language`,`count`) VALUES ((SELECT languages.`name`, COUNT(*) AS `total` FROM languages JOIN events ON languages.id = events.language_id GROUP BY name HAVING total > 250 ORDER BY total DESC));
ERROR 1136 (21S01): Column count doesn't match value count at row 1
thanks.
MySQL doesn't support this sort of multiple-column-returning subquery, so the error message you're seeing is because the VALUES clause only contains one subquery, which is perforce (in a sense) only one column.
To fix it, you can skip the VALUES syntax, and just write:
INSERT
INTO test (`language`,`count`)
SELECT languages.`name`, COUNT(*) AS `total`
FROM languages
JOIN events
ON languages.id = events.language_id
GROUP
BY name
HAVING total > 250
ORDER
BY total DESC
;
(See §13.2.5.1 "INSERT ... SELECT Syntax" in the MySQL 5.6 Reference Manual.)
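To see the INSERT ... SELECT form work end to end, here is a small self-contained sketch using an in-memory SQLite database from Python. The table contents are invented, the threshold is lowered from 250 to 2 so that a handful of rows is enough, and the HAVING clause repeats COUNT(*) rather than the alias only to stay portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, language_id INTEGER);
    CREATE TABLE test (language TEXT, "count" INTEGER);
    INSERT INTO languages (id, name) VALUES (1, 'Spanish'), (2, 'Urdu'), (3, 'Pashto');
""")
# Invented data: 5 Spanish events, 3 Urdu events, 1 Pashto event.
conn.executemany(
    "INSERT INTO events (language_id) VALUES (?)",
    [(1,)] * 5 + [(2,)] * 3 + [(3,)],
)

# No VALUES clause: the SELECT supplies both columns for every inserted row.
conn.execute(
    """
    INSERT INTO test (language, "count")
    SELECT languages.name, COUNT(*) AS total
    FROM languages
    JOIN events ON languages.id = events.language_id
    GROUP BY languages.name
    HAVING COUNT(*) > ?
    """,
    (2,),
)

rows = conn.execute(
    'SELECT language, "count" FROM test ORDER BY "count" DESC'
).fetchall()
print(rows)
```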
INSERT INTO test (`language`,`count`)
should be
INSERT INTO test (language,count)
I have a table like this:
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| v1 | int(11) | YES | MUL | NULL | |
| v2 | int(11) | YES | MUL | NULL | |
+-------+---------+------+-----+---------+-------+
There is a tremendous amount of duplication in this table. For instance, elements like the following:
+------+------+
| v1 | v2 |
+------+------+
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
| 1 | 7 |
| 1 | 8 |
| 1 | 9 |
| 2 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 1 |
| 8 | 1 |
| 9 | 1 |
+------+------+
The table is large with 1540000 entries. To remove the redundant entries (i.e. to get a table having only (1,9) and no (9,1) entries), I was thinking of doing it with a subquery but is there a better way of doing this?
Actually, @Mark's approach will work too. I just figured out another way of doing it and was wondering if I can get some feedback on this as well. I tested it and it seems to work fast.
SELECT v1,v2 FROM edges WHERE v1<v2 UNION SELECT v2,v1 FROM edges WHERE v1>v2;
In the case where this is right, you can always create a new table:
CREATE TABLE newtable AS SELECT v1,v2 FROM edges WHERE v1<v2 UNION SELECT v2,v1 FROM edges WHERE v1>v2;
Warning: these commands modify your database. Make sure you have a backup copy so that you can restore the data again if necessary.
You can add the requirement that v1 must be less than v2, which will cut your storage requirement roughly in half. To make every existing row satisfy this condition, reorder the rows that don't and, where both orderings are present, delete one of them.
This query will insert any missing rows where you have for example (5, 1) but not (1, 5):
INSERT INTO table1
SELECT T1.v2, T1.v1
FROM table1 T1
LEFT JOIN table1 T2
ON T1.v1 = T2.v2 AND T1.v2 = T2.v1
WHERE T1.v1 > T1.v2 AND T2.v1 IS NULL
Then this query deletes the rows you don't want, like (5, 1):
DELETE FROM table1 WHERE v1 > v2
You might need to change other places in your code that were programmed before this constraint was added.
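The two steps above can be tried on a throwaway copy first. Below is a minimal sketch against an in-memory SQLite database from Python, with a few invented rows; both statements run inside one transaction so that a failure part-way through leaves the table untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (v1 INT, v2 INT)")
# Invented sample: (1,2)/(2,1) and (1,9)/(9,1) exist both ways round,
# (5,1) exists only in the "wrong" order, (1,3) only in the right one.
conn.executemany(
    "INSERT INTO table1 (v1, v2) VALUES (?, ?)",
    [(1, 2), (2, 1), (1, 3), (5, 1), (9, 1), (1, 9)],
)

with conn:  # commits on success, rolls back if either statement raises
    # Step 1: add the flipped pair wherever only the v1 > v2 form exists.
    conn.execute("""
        INSERT INTO table1 (v1, v2)
        SELECT T1.v2, T1.v1
        FROM table1 T1
        LEFT JOIN table1 T2
          ON T1.v1 = T2.v2 AND T1.v2 = T2.v1
        WHERE T1.v1 > T1.v2 AND T2.v1 IS NULL
    """)
    # Step 2: drop every row that violates v1 < v2.
    conn.execute("DELETE FROM table1 WHERE v1 > v2")

rows = conn.execute("SELECT v1, v2 FROM table1 ORDER BY v1, v2").fetchall()
print(rows)
```

Note that neither step touches rows where v1 = v2, so those stay as they are.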