I have a recipe table called recipes. It holds the IDRecipe field and the other parameters of a recipe, except its categories. Categories are multi-dimensional, so a second table links each recipe to many categories (table 1 below). One recipe can have multiple categories in multiple dimensions, so a further table (table 2) describes the categories and dimensions:
-- Table 1
CREATE TABLE `recepti_kategorije` (
`IDRecipe` int(11) NOT NULL,
`IDdimenzija` int(11) NOT NULL,
`IDKategorija` int(11) NOT NULL,
KEY `Iskanje` (`IDdimenzija`,`IDKategorija`,`IDRecipe`) USING BTREE,
KEY `izvlecek_recept` (`IDdimenzija`,`IDRecipe`),
KEY `IDRecipe` (`IDRecipe`,`IDdimenzija`,`IDKategorija`) USING BTREE,
KEY `kategorija` (`IDKategorija`,`IDdimenzija`,`IDRecipe`) USING BTREE
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_slovenian_ci;
INSERT INTO `recepti_kategorije` VALUES
(1,1,1),
(1,1,2),
(1,2,3),
(1,3,2);
-- Table 2
CREATE TABLE `recepti_dimenzije` (
`IDDimenzija` int(11) NOT NULL,
`IDKategorija` int(11) NOT NULL,
`Ime` char(50) COLLATE utf8_slovenian_ci NOT NULL,
KEY `IDDimenzija` (`IDDimenzija`,`IDKategorija`) USING BTREE,
KEY `IDKategorija` (`IDKategorija`,`IDDimenzija`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_slovenian_ci;
INSERT INTO `recepti_dimenzije` VALUES
(1,1,'cheese'),
(1,2,'eggs'),
(1,3,'meat'),
(1,4,'vegetables'),
(2,1,'main dish'),
(2,2,'sweet'),
(2,3,'soup'),
(3,1,'summer'),
(3,2,'winter');
-- Table 3
CREATE TABLE `recepti_dimenzije_glavne` (
`IDDimenzija` int(11) NOT NULL,
`DimenzijaIme` char(50) COLLATE utf8_slovenian_ci DEFAULT NULL,
PRIMARY KEY (`IDDimenzija`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_slovenian_ci;
INSERT INTO `recepti_dimenzije_glavne` VALUES
(1,'ingredient'),
(2,'type'),
(3,'season');
Table 2 is the lookup table that gives the name of each category within each dimension.
So from this example we see that the recipe with ID 1 is tagged cheese and eggs in dimension 1, and is a soup for the winter season.
On my recipe page I need to print the name of each dimension together with all of its category names.
Table 3 (above) provides the names of the dimensions.
What I need is a single query that, for recipe ID = 1, returns all the dimensions with their category names group-concatenated, like:
ingredient: cheese, eggs | type: soup | season: winter
I tried a separate subquery per dimension in the SELECT list and it works, but I need 8 subqueries (I have 8 dimensions in total; the example shows only 3). My query is:
SELECT
r.ID,
(
SELECT
group_concat(ime SEPARATOR ', ')
FROM
recepti_kategorije rkat
JOIN recepti_dimenzije rd ON rd.IDKategorija = rkat.IDKategorija
AND rd.IDDimenzija = rkat.IDdimenzija
WHERE
rkat.IDRecipe = r.ID
AND rkat.IDDimenzija = 1
ORDER BY
ime ASC
) AS ingredient,
(
SELECT
group_concat(ime SEPARATOR ', ')
FROM
recepti_kategorije rkat
JOIN recepti_dimenzije rd ON rd.IDKategorija = rkat.IDKategorija
AND rd.IDDimenzija = rkat.IDdimenzija
WHERE
rkat.IDRecipe = r.ID
AND rkat.IDDimenzija = 2
ORDER BY
ime ASC
) AS type,
(
SELECT
group_concat(ime SEPARATOR ', ')
FROM
recepti_kategorije rkat
JOIN recepti_dimenzije rd ON rd.IDKategorija = rkat.IDKategorija
AND rd.IDDimenzija = rkat.IDdimenzija
WHERE
rkat.IDRecipe = r.ID
AND rkat.IDDimenzija = 3
ORDER BY
ime ASC
) AS season
FROM
recipes r
WHERE
r.ID = 1
That works, but it is rather slow: EXPLAIN says each subquery examines about 6-8 rows, the query is long, and I still don't get the names of the dimensions because that needs another join.
What would be the optimal way to get all the dimensions separated into fields and concatenated with their category names? I need this optimised because it serves a single-recipe page that is requested about once per second, so I cannot afford a slow query here. And which indexes do I need to make this fast?
Something like the query below. I am not sure I typed the table and column names correctly, but it should be easy to debug:
SELECT c.ID,GROUP_CONCAT(CONCAT(d.DimenzijaIme,': ',c.imes) SEPARATOR ' | ')
FROM (
SELECT
r.ID,rkat.IDDimenzija,
group_concat(rd.ime ORDER BY rd.ime SEPARATOR ', ') AS imes
FROM recepti_kategorije rkat
JOIN recepti_dimenzije rd
ON rd.IDKategorija = rkat.IDKategorija
AND rd.IDDimenzija = rkat.IDdimenzija
INNER JOIN recipes r
ON r.ID=rkat.IDRecipe
GROUP BY r.ID,rkat.IDDimenzija) c
INNER JOIN recepti_dimenzije_glavne d
ON d.IDDimenzija=c.IDDimenzija
GROUP BY c.ID
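The two-level aggregation above can be tried on the sample data with a small sketch. It rests on several assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, so GROUP_CONCAT takes the separator as a second argument and the order inside each concatenated list is not guaranteed; table 2 is named recepti_dimenzije as in the queries; and a WHERE filter on the recipe is added inside the derived table, since only one recipe is shown per page. The helper name recipe_legend is made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; indexes and collations are omitted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recepti_kategorije (IDRecipe INT, IDDimenzija INT, IDKategorija INT);
INSERT INTO recepti_kategorije VALUES (1,1,1),(1,1,2),(1,2,3),(1,3,2);
CREATE TABLE recepti_dimenzije (IDDimenzija INT, IDKategorija INT, Ime TEXT);
INSERT INTO recepti_dimenzije VALUES
  (1,1,'cheese'),(1,2,'eggs'),(1,3,'meat'),(1,4,'vegetables'),
  (2,1,'main dish'),(2,2,'sweet'),(2,3,'soup'),
  (3,1,'summer'),(3,2,'winter');
CREATE TABLE recepti_dimenzije_glavne (IDDimenzija INT PRIMARY KEY, DimenzijaIme TEXT);
INSERT INTO recepti_dimenzije_glavne VALUES (1,'ingredient'),(2,'type'),(3,'season');
""")

def recipe_legend(recipe_id):
    """Return 'dimension: cat, cat | dimension: cat' for one recipe (hypothetical helper)."""
    row = conn.execute("""
        SELECT GROUP_CONCAT(d.DimenzijaIme || ': ' || c.imes, ' | ')
        FROM (
            -- Inner level: one row per dimension with its category names joined.
            SELECT rkat.IDDimenzija, GROUP_CONCAT(rd.Ime, ', ') AS imes
            FROM recepti_kategorije rkat
            JOIN recepti_dimenzije rd
              ON rd.IDKategorija = rkat.IDKategorija
             AND rd.IDDimenzija = rkat.IDDimenzija
            WHERE rkat.IDRecipe = ?
            GROUP BY rkat.IDDimenzija
        ) c
        -- Outer level: attach the dimension name and join the dimensions together.
        JOIN recepti_dimenzije_glavne d ON d.IDDimenzija = c.IDDimenzija
    """, (recipe_id,)).fetchone()
    return row[0]

legend = recipe_legend(1)
print(legend)
```

Filtering inside the derived table keeps the inner GROUP BY down to the handful of rows of one recipe, which in MySQL would presumably be served by the existing (IDRecipe, IDdimenzija, IDKategorija) key.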
I am trying to optimize a MySQL query that works perfectly but takes way too long. My inventory table has nearly 300,000 records (not too bad). I am not sure whether a subquery, a join, or an additional index would speed up my results. The district_id columns are indexed in both the students and inventory tables.
Basically, the query below pulls all the inventory of all students in a teacher's roster. So it first has to search the students table to find which students are in the teacher's roster, then has to search the inventory table for each student. So if a teacher has 30+ students it can be a lot of searches through the inventory and each student can have 30+ pieces of inventory. Any advice would be helpful!
SELECT inventory.inventory_id, items.title, items.isbn, items.item_num,
items.price, conditions.condition_name, inventory.check_out,
inventory.check_in, inventory.student_id, inventory.teacher_id
FROM inventory, conditions, items, students
WHERE students.teacher_id = '$teacher_id'
AND students.district_id = $district_id
AND inventory.student_id = students.s_number
AND inventory.district_id = $district_id
AND inventory.item_id = items.item_id
AND items.consumable !=1
AND conditions.condition_id = inventory.condition_id
ORDER BY inventory.student_id, inventory.inventory_id
Here is the table structure:
CREATE TABLE `inventory` (
`id` int(11) NOT NULL,
`inventory_id` varchar(10) CHARACTER SET utf8 NOT NULL DEFAULT '0',
`item_id` int(6) NOT NULL DEFAULT '0',
`district_id` int(2) NOT NULL DEFAULT '0',
`condition_id` int(1) NOT NULL DEFAULT '0',
`check_out` date NOT NULL DEFAULT '0000-00-00',
`check_in` date NOT NULL DEFAULT '0000-00-00',
`student_id` varchar(10) CHARACTER SET utf8 NOT NULL DEFAULT '0',
`teacher_id` varchar(6) CHARACTER SET utf8 NOT NULL DEFAULT '0',
`acquisition_date` date NOT NULL DEFAULT '0000-00-00',
`notes` text CHARACTER SET utf8 NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
First you rewrite this to use explicit JOINs:
SELECT inventory.inventory_id,
items.title, items.isbn, items.item_num, items.price,
conditions.condition_name,
inventory.check_out, inventory.check_in,
inventory.student_id, inventory.teacher_id
FROM inventory
JOIN conditions ON (conditions.condition_id = inventory.condition_id)
JOIN items ON (inventory.item_id = items.item_id AND items.consumable != 1)
JOIN students ON (inventory.student_id = students.s_number)
WHERE students.teacher_id = '$teacher_id'
AND students.district_id = $district_id
AND inventory.district_id = $district_id
ORDER BY inventory.student_id, inventory.inventory_id
Then you examine the JOINs. For example this:
JOIN items ON (inventory.item_id = items.item_id AND items.consumable != 1)
means that the items table needs to be scanned on item_id and consumable, which might be a constant. It is always better to not use negative conditions if possible. But at the very least you index items on item_id (unless it's already the primary key, as is likely). If consumable can assume, say, values 0, 1, 2, 3, then you go:
JOIN items ON (inventory.item_id = items.item_id AND items.consumable IN (0, 2, 3))
and use CREATE INDEX to add an index on consumable.
You may notice that a few columns from inventory are always used in the other JOINs, and there are also some constant constraints.
So another useful index could be
CREATE INDEX ... ON inventory(district_id, student_id, item_id, condition_id)
Another useful index would be
CREATE INDEX ... ON students(teacher_id, district_id, student_id, s_number)
which allows immediately restricting the WHERE on the involved students, and retrieve the information required by the JOINs without ever loading the table, just using the index.
Switch to InnoDB! Some of what I am about to say is less efficient in MyISAM.
SELECT i.inventory_id,
items.title, items.isbn, items.item_num, items.price,
c.condition_name,
i.check_out, i.check_in, i.student_id, i.teacher_id
FROM inventory AS i
JOIN conditions AS c ON c.condition_id = i.condition_id
JOIN items ON i.item_id = items.item_id
JOIN students AS s ON i.student_id = s.s_number
WHERE s.teacher_id = '$teacher_id'
AND s.district_id = $district_id
AND i.student_id = s.s_number
AND i.district_id = $district_id
AND items.consumable != 1
ORDER BY i.student_id, i.inventory_id
To help the Optimizer if it would like to start with students:
students: INDEX(district_id, teacher_id, s_number)
Note: this is also "covering", thereby avoiding bouncing between index BTree and data BTree. (What is the PK of students? Please provide SHOW CREATE TABLE.)
If consuming the ORDER BY is better:
inventory: INDEX(district_id, student_id, inventory_id)
Also needed:
items: (item_id) -- probably already the PRIMARY KEY?
conditions: (condition_id) -- probably already the PRIMARY KEY?
Verify or add those 4 indexes. (The Optimizer will dynamically choose what to do.)
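To see what "covering" means in practice, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in, so its planner and plan output differ from MySQL's EXPLAIN; the students columns other than s_number, teacher_id and district_id, the index name, and the bound values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical students table: the three columns the query touches,
-- plus one (full_name) that it does not.
CREATE TABLE students (
    s_number    TEXT NOT NULL,
    teacher_id  TEXT NOT NULL,
    district_id INTEGER NOT NULL,
    full_name   TEXT
);
-- Equality columns first, then the column the JOIN needs.
CREATE INDEX idx_students_roster
    ON students (district_id, teacher_id, s_number);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT s_number FROM students
    WHERE district_id = ? AND teacher_id = ?
""", (7, "T123")).fetchall()

# Each plan row has four fields; the last one is the human-readable detail.
details = [row[3] for row in plan]
print(details)
```

When the detail text mentions a covering index, the lookup is answered from the index alone. In MySQL the comparable signal is "Using index" in the Extra column of EXPLAIN.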
I have a database in which I store a Japanese dictionary: words, readings, tags, types, meanings in other languages (English is the most important here, but there are a few others), and so on.
Now I want to create an interface using the DataTables JS plugin, so that a user can see the table and use some filtering options (such as showing only verbs, or finding entries containing "dog"). However, I'm struggling with the query, which can be pretty slow when filtering. I have already sped it up a lot, but it is still not good enough.
This is my basic query:
select
v.id,
(
select group_concat(distinct vke.kanji_element separator '; ') from vocabulary_kanji_element as vke
where vke.vocabulary_id = v.id
) kanji_notation,
(
select group_concat(distinct vre.reading_element separator '; ') from vocabulary_reading_element as vre
where vre.vocabulary_id = v.id
) reading_notation,
(
select group_concat(distinct vsg.gloss separator '; ') from vocabulary_sense_gloss as vsg
join vocabulary_sense as vs on vsg.sense_id = vs.id
join language as l on vsg.language_id = l.id and l.language_code = 'eng'
where vs.vocabulary_id = v.id
) meanings,
(
select group_concat(distinct pos.name_code separator '; ') from vocabulary_sense as vs
join vocabulary_sense_has_pos as vshp on vshp.sense_id = vs.id
join part_of_speech as pos on pos.id = vshp.pos_id
where vs.vocabulary_id = v.id
) pos
from vocabulary as v
join vocabulary_sense as vs on vs.vocabulary_id = v.id
join vocabulary_sense_gloss as vsg on vsg.sense_id = vs.id
join vocabulary_kanji_element as vke on vke.vocabulary_id = v.id
join vocabulary_reading_element as vre on vre.vocabulary_id = v.id
join language as l on l.id = vsg.language_id and l.language_code = 'eng'
join vocabulary_sense_has_pos as vshp on vshp.sense_id = vs.id
join part_of_speech as pos on pos.id = vshp.pos_id
where
-- pos.name_code = 'n' and
(vsg.gloss like '%eat%' OR vke.kanji_element like '%eat%' OR vre.reading_element like '%eat%')
group by v.id
order by v.id desc
-- limit 3900, 25
Output is something like this:
|id | kanji_notation | reading_notation | meanings | pos |
---------------------------------------------------------------
|117312| お手; 御手 | おて | hand; arm |n; int|
Right now (working on my local machine), with no WHERE clause but with a LIMIT, it is fast: about 0.140 s. But when text filtering is on, execution time goes up to 6.5 s, and often higher. When filtering on part_of_speech first, it is about 5.5 s. 3 s would be OK, but 6 s is just way too long.
There are 1,155,897 records in the vocabulary_sense_gloss table, which I don't think is a lot.
CREATE TABLE `vocabulary_sense_gloss` (
`id` MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
`sense_id` MEDIUMINT(8) UNSIGNED NOT NULL,
`gloss` VARCHAR(255) NOT NULL,
`language_id` MEDIUMINT(8) UNSIGNED NOT NULL,
PRIMARY KEY (`id`),
INDEX `vocabulary_sense_gloss_vocabulary_sense_id` (`sense_id`),
INDEX `vocabulary_sense_gloss_language_id` (`language_id`),
FULLTEXT INDEX `vocabulary_sense_gloss_gloss` (`gloss`),
CONSTRAINT `vocabulary_sense_gloss_language_id` FOREIGN KEY (`language_id`) REFERENCES `language` (`id`),
CONSTRAINT `vocabulary_sense_gloss_vocabulary_sense_id` FOREIGN KEY (`sense_id`) REFERENCES `vocabulary_sense` (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
Is there some way to optimize it, or should I change my database? I tried fulltext search, but it is not much faster and seems to match only whole terms, so it is of no use. It is a similar story with 'eat%' instead of '%eat%': it won't return what I want.
I tried splitting vocabulary_sense_gloss into two tables, one with English-only terms and one with the rest. Since users would usually search in English anyway, that would make things faster, but I'm not sure it is a good approach.
I also tried changing VARCHAR to CHAR. It seemed to reduce execution time, though the table size went up a lot.
This WHERE clause has extremely poor performance.
(vsg.gloss like '%eat%' OR
vke.kanji_element like '%eat%' OR
vre.reading_element like '%eat%')
Why? First of all: column LIKE '%constant%' requires the query engine to examine every possible value of column. It can't possibly use an index because of the leading % in the constant search term.
Second: the OR clause means the query planner has to scan the results three different times.
What are you going to do to improve this? It won't be easy. You need to figure out how to use column LIKE 'constant%' search terms eliminating the leading % from the constants.
Once you do that, you may be able to beat the triple scan of your vast joined result set with a construct like this
...
WHERE v.id IN
(SELECT vs.vocabulary_id AS id
FROM vocabulary_sense_gloss AS vsg
JOIN vocabulary_sense AS vs ON vs.id = vsg.sense_id
WHERE vsg.gloss LIKE 'eat%'
UNION
SELECT vocabulary_id AS id
FROM vocabulary_kanji_element
WHERE kanji_element LIKE 'eat%'
UNION
SELECT vocabulary_id AS id
FROM vocabulary_reading_element
WHERE reading_element LIKE 'eat%'
)
This will pull the id numbers of the relevant words directly, rather than from the result of a multiway JOIN. For this to be fast, your vocabulary_sense_gloss table will need an index on (gloss, sense_id). The other two tables will need similar indexes, with the searched text column first and vocabulary_id second.
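A minimal sketch of the UNION lookup, under stated assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the tables are cut down to the columns involved, and the three sample words are invented. Because gloss rows carry a sense_id rather than a vocabulary id, the sketch maps them to vocabulary_id through vocabulary_sense; the sense ids are deliberately different from the vocabulary ids to make that visible. The function name matching_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vocabulary (id INTEGER PRIMARY KEY);
CREATE TABLE vocabulary_sense (id INTEGER PRIMARY KEY, vocabulary_id INT);
CREATE TABLE vocabulary_sense_gloss (sense_id INT, gloss TEXT);
CREATE TABLE vocabulary_kanji_element (vocabulary_id INT, kanji_element TEXT);
CREATE TABLE vocabulary_reading_element (vocabulary_id INT, reading_element TEXT);
-- Invented sample rows.
INSERT INTO vocabulary VALUES (1),(2),(3);
INSERT INTO vocabulary_sense VALUES (10,1),(20,2),(30,3);
INSERT INTO vocabulary_sense_gloss VALUES (10,'eat'),(20,'hand'),(30,'eating utensils');
INSERT INTO vocabulary_kanji_element VALUES (1,'食べる'),(2,'手');
INSERT INTO vocabulary_reading_element VALUES (1,'たべる'),(2,'て');
""")

def matching_ids(prefix):
    """Ids of words whose gloss, kanji or reading starts with prefix."""
    # Note: '%' or '_' inside prefix are not escaped in this sketch.
    rows = conn.execute("""
        SELECT v.id FROM vocabulary AS v
        WHERE v.id IN (
            SELECT vs.vocabulary_id
            FROM vocabulary_sense_gloss AS vsg
            JOIN vocabulary_sense AS vs ON vs.id = vsg.sense_id
            WHERE vsg.gloss LIKE :p
            UNION
            SELECT vocabulary_id FROM vocabulary_kanji_element
            WHERE kanji_element LIKE :p
            UNION
            SELECT vocabulary_id FROM vocabulary_reading_element
            WHERE reading_element LIKE :p
        )
        ORDER BY v.id DESC
    """, {"p": prefix + "%"}).fetchall()
    return [r[0] for r in rows]

print(matching_ids("eat"))
```

Each branch of the UNION searches one column with a prefix pattern, so each can use its own index; the outer query then only has to fetch the few matching words.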
I have 3 tables:
ITEMS       ITEM_FILES_MAP    FILES
id          id                id
name        item_id           filename
in_trash    file_id
FILES has a one to many relationship with ITEMS through the ITEM_FILES_MAP table.
I need a select query that returns a list of files by the following criteria:
Only return files related to items where in_trash = 1
Avoid files that are related to items where in_trash = 0
Example:
ITEMS
id name in_trash
1 Item A 0
2 Item B 0
3 Item C 1
4 Item D 1
FILES
id filename
1 File A
2 File B
3 File C
4 File D
5 File E
ITEM_FILES_MAP
id item_id file_id
1 1 2
2 1 3
3 2 1
4 3 2
5 3 4
6 4 3
7 4 4
Desired result:
Returns File D (id 4).
Files B, C and D (ids 2, 3 and 4 in the FILES table) are candidates because they are related to trashed items, but Files B and C are also related to items where in_trash = 0, so they must not be listed.
Here is a sample dump if you want to test out solutions:
CREATE TABLE `files` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`filename` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `files` (`id`, `filename`)
VALUES
(1,'File A'),
(2,'File B'),
(3,'File C'),
(4,'File D'),
(5,'File E');
CREATE TABLE `item_files_map` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) DEFAULT NULL,
`file_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `item_files_map` (`id`, `item_id`, `file_id`)
VALUES
(1,1,2),
(2,1,3),
(3,2,1),
(4,3,2),
(5,3,4),
(6,4,3),
(7,4,4);
CREATE TABLE `items` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`in_trash` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `items` (`id`, `name`, `in_trash`)
VALUES
(1,'Item A',0),
(2,'Item B',0),
(3,'Item C',1),
(4,'Item D',1);
Preparations
First, make sure you have a UNIQUE INDEX on fields item_id and file_id (in this order) on table item_files_map. No matter what query you run, if it includes this table the index will make things fly instead of crawl. For some queries an index with the fields in the opposite order would help more, but for this task we need them in the order shown.
ALTER TABLE item_files_map
ADD UNIQUE INDEX item_file_id(`item_id`, `file_id`);
Also make sure you have an INDEX on items.in_trash.
ALTER TABLE items
ADD INDEX (`in_trash`);
For large tables it's possible that MySQL will ignore this index if the ratio between the number of 1 and 0 values is somewhere between 0.05 and 20 (that is, if neither value appears in fewer than 5% of the rows).
Probably the items having in_trash=1 are much fewer than those having in_trash=0 (or vice versa), and this will convince MySQL to use the index for one of the instances of table items, because the index removes a lot of rows from examination.
Moreover, because the queries use only the PK and in_trash fields from this table, MySQL will use the index to get the information it needs and will not read the table data. And since the index is smaller than the table data, reading fewer bytes from storage improves the execution speed.
The query, first attempt (following all the requirements)
A query that does what you need is:
# Query #1
SELECT DISTINCT f.id, f.filename
FROM items iit1
INNER JOIN item_files_map ifm1 ON iit1.id = ifm1.item_id
INNER JOIN files f ON f.id = ifm1.file_id
WHERE iit1.in_trash = 1
AND ifm1.file_id NOT IN (
SELECT ff.id
FROM files ff
INNER JOIN item_files_map ifm0 ON ff.id = ifm0.file_id
INNER JOIN items iit0 ON iit0.id = ifm0.item_id
WHERE iit0.in_trash = 0
);
Improving the query by slimming it
This query is not as good as can get and it can be improved if you are absolutely sure that table item_files_map does not contain orphan file_id values (i.e. values that cannot be found in column files.id). This should not happen on a well designed application and the database can help you avoid such situations by using FOREIGN KEY constraints (on InnoDB only).
Assuming this condition is met, we can remove table files from the inner query, making it simpler and faster:
# Query #2
SELECT DISTINCT f.id, f.filename
FROM items iit1
INNER JOIN item_files_map ifm1 ON iit1.id = ifm1.item_id
INNER JOIN files f ON f.id = ifm1.file_id
WHERE iit1.in_trash = 1
AND ifm1.file_id NOT IN (
SELECT ifm0.file_id
FROM item_files_map ifm0
INNER JOIN items iit0 ON iit0.id = ifm0.item_id
WHERE iit0.in_trash = 0
);
This query will produce the correct results.
The final query (ignore some of the requirements but produces correct results)
Another optimization is to select only the file id and leave out the filename for now; a second query will fetch it:
# Query #3
SELECT DISTINCT ifm1.file_id
FROM items iit1
INNER JOIN item_files_map ifm1 ON iit1.id = ifm1.item_id
WHERE iit1.in_trash = 1
AND ifm1.file_id NOT IN (
SELECT ifm0.file_id
FROM item_files_map ifm0
INNER JOIN items iit0 ON iit0.id = ifm0.item_id
WHERE iit0.in_trash = 0
);
You can change the last JOIN to:
INNER JOIN items iit0 FORCE INDEX(PRIMARY) ON iit0.id = ifm0.item_id
to force MySQL to use the PK for that join, but I cannot tell whether it will run faster. Maybe it will when the table becomes bigger.
This query doesn't select the filename (because it doesn't access the files table at all). It can be easily fetched (together with other fields from table files or with fields selected from other joined tables) using a query that runs like the wind because it uses the table's PK to get the rows it needs:
# Query #3-extra
SELECT *
FROM files
WHERE id IN (1, 2, 3)
Replace 1, 2, 3 with the list of file IDs returned by the previous query.
For big tables, these two queries together could run faster than Query #2.
Remark
As explained in the previous section, Query #2 and Query #3 assume there are no orphan file_id entries in the item_files_map table. If such orphan entries exist Query #3 can return invalid file_id values but they will be filtered out by Query #3-extra and the final result set returned by it will contain only valid results.
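The two-step variant (Query #3 followed by Query #3-extra) can be sketched on the sample dump as follows. The assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the column types are simplified, and the function name trashed_only_files is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, in_trash INT DEFAULT 0);
CREATE TABLE item_files_map (id INTEGER PRIMARY KEY, item_id INT, file_id INT);
CREATE UNIQUE INDEX item_file_id ON item_files_map (item_id, file_id);
INSERT INTO files VALUES (1,'File A'),(2,'File B'),(3,'File C'),(4,'File D'),(5,'File E');
INSERT INTO items VALUES (1,'Item A',0),(2,'Item B',0),(3,'Item C',1),(4,'Item D',1);
INSERT INTO item_files_map VALUES (1,1,2),(2,1,3),(3,2,1),(4,3,2),(5,3,4),(6,4,3),(7,4,4);
""")

def trashed_only_files():
    # Step 1 (Query #3): ids of files linked to trashed items and to no other item.
    ids = [row[0] for row in conn.execute("""
        SELECT DISTINCT ifm1.file_id
        FROM items iit1
        INNER JOIN item_files_map ifm1 ON iit1.id = ifm1.item_id
        WHERE iit1.in_trash = 1
          AND ifm1.file_id NOT IN (
              SELECT ifm0.file_id
              FROM item_files_map ifm0
              INNER JOIN items iit0 ON iit0.id = ifm0.item_id
              WHERE iit0.in_trash = 0)
    """)]
    if not ids:
        return []  # MySQL rejects an empty IN () list, so stop early
    # Step 2 (Query #3-extra): one bound placeholder per id, fetched by primary key.
    placeholders = ", ".join("?" for _ in ids)
    return conn.execute(
        f"SELECT id, filename FROM files WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()

result = trashed_only_files()
print(result)
```

Binding the ids as parameters instead of pasting them into the SQL text keeps the second query safe even if the id list ever comes from a less trusted source.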
I did not test this in MySQL, but you could do something like this:
SELECT filename FROM
(SELECT filename, sum(in_trash) AS s, count(*) AS c
FROM items, files, item_files_map
WHERE items.id = item_files_map.item_id AND files.id = item_files_map.file_id
GROUP BY filename) sub
WHERE s = c
The subquery computes, for each filename, the number of in-trash items referencing it (s) and the total number of items referencing it (c). For your example it returns (filename, s, c):
"D" 2 2
"B" 1 2
"C" 1 2
"A" 0 1
If the two counts are equal, then only in-trash items reference the file.
EDIT: Following the suggestions of axiac, here is the query:
SELECT filename, files.id, sum(in_trash) AS s, count(*) AS c
FROM items, files, item_files_map
WHERE items.id = item_files_map.item_id AND files.id = item_files_map.file_id
GROUP BY files.id
HAVING s = c
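The counting approach can be checked against the sample data with a short sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL and uses explicit JOIN syntax; the variable names counts and only_trashed are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, in_trash INT DEFAULT 0);
CREATE TABLE item_files_map (id INTEGER PRIMARY KEY, item_id INT, file_id INT);
INSERT INTO files VALUES (1,'File A'),(2,'File B'),(3,'File C'),(4,'File D'),(5,'File E');
INSERT INTO items VALUES (1,'Item A',0),(2,'Item B',0),(3,'Item C',1),(4,'Item D',1);
INSERT INTO item_files_map VALUES (1,1,2),(2,1,3),(3,2,1),(4,3,2),(5,3,4),(6,4,3),(7,4,4);
""")

JOINS = """
    FROM files f
    JOIN item_files_map m ON m.file_id = f.id
    JOIN items i ON i.id = m.item_id
"""

# Per file: s = items in trash that reference it, c = all items that reference it.
counts = conn.execute(
    "SELECT f.filename, SUM(i.in_trash) AS s, COUNT(*) AS c"
    + JOINS + "GROUP BY f.id, f.filename ORDER BY f.id"
).fetchall()

# Keep only the files for which every referencing item is in the trash.
only_trashed = conn.execute(
    "SELECT f.id, f.filename"
    + JOINS + "GROUP BY f.id, f.filename HAVING SUM(i.in_trash) = COUNT(*) ORDER BY f.id"
).fetchall()

print(counts)
print(only_trashed)
```

File E never appears because it has no row in item_files_map, so the inner joins drop it before grouping. The comparison also relies on in_trash holding only 0 or 1 and never NULL.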
I have 3 tables:
CREATE TABLE IF NOT EXISTS `disksinfo` (
`idx` int(10) NOT NULL AUTO_INCREMENT,
`hostinfo_idx` int(10) DEFAULT NULL,
`id` char(30) DEFAULT NULL,
`name` char(30) DEFAULT NULL,
`size` bigint(20) DEFAULT NULL,
`freespace` bigint(20) DEFAULT NULL,
PRIMARY KEY (`idx`)
)
CREATE TABLE IF NOT EXISTS `hostinfo` (
`idx` int(10) NOT NULL AUTO_INCREMENT,
`host_idx` int(11) DEFAULT NULL,
`probetime` datetime DEFAULT NULL,
`processor_load` tinyint(4) DEFAULT NULL,
`memory_total` bigint(20) DEFAULT NULL,
`memory_free` bigint(20) DEFAULT NULL,
PRIMARY KEY (`idx`)
)
CREATE TABLE IF NOT EXISTS `hosts` (
`idx` int(10) NOT NULL AUTO_INCREMENT,
`name` char(30) DEFAULT '0',
PRIMARY KEY (`idx`)
)
Basically, hosts is just a fixed list of hostnames used in the hostinfo table (hostinfo.host_idx = hosts.idx).
hostinfo is a table that is filled every few minutes with data from all hosts, and for each hostinfo row at least one disksinfo row is created. Each disksinfo row contains information about at least one disk (so for some hosts there are 3-4 disksinfo rows). disksinfo.hostinfo_idx = hostinfo.idx.
hostinfo.probetime is simply the time at which the data snapshot was created.
What I want to do now is select the latest hostinfo row (by .probetime) for each distinct host (hostinfo.host_idx), while joining in the information about disks (disksinfo table) and host names (hosts table).
I came with this:
SELECT hinfo.idx,
hinfo.host_idx,
hinfo.processor_load,
hinfo.memory_total,
hinfo.memory_free,
hnames.idx,
hnames.name,
disks.hostinfo_idx,
disks.id,
disks.name,
disks.size,
disks.freespace,
Max(hinfo.probetime)
FROM systeminfo.hostinfo AS hinfo
INNER JOIN systeminfo.hosts AS hnames
ON hnames.idx = hinfo.host_idx
INNER JOIN systeminfo.disksinfo AS disks
ON disks.hostinfo_idx = hinfo.idx
GROUP BY disks.id,
hnames.name
ORDER BY hnames.name,
disks.id
It seems to work! But, is it 100% correct? Is it optimal? Thanks for any tip!
It's not 100% correct, no.
Suppose you have this table:
x | y | z
-----------------
a b 1
a c 2
d e 1
d f 2
Now when you only group by x, the rows collapse and MySQL picks an arbitrary row from each collapsed group. So you might get
x | y | z
-----------------
a b 2
d e 2
or this
x | y | z
-----------------
a c 2
d f 2
Or another combination; the result is not deterministic. Each time you run your query you might get a different result. The 2 in column z is always there because of the MAX() function, but you won't necessarily get the row that corresponds to it.
Other RDBMSs would have the same problem, but most forbid such queries by default (it can be forbidden in MySQL, too, with the ONLY_FULL_GROUP_BY SQL mode). You have two possibilities to fix this (actually there are more, but I'll restrict myself to two).
Either you put all columns you have in your SELECT clause which are not used in an aggregate function like SUM() or MAX() or whatever into the GROUP BY clause as well, like this:
SELECT hinfo.idx,
hinfo.host_idx,
hinfo.processor_load,
hinfo.memory_total,
hinfo.memory_free,
hnames.idx,
hnames.name,
disks.hostinfo_idx,
disks.id,
disks.name,
disks.size,
disks.freespace,
Max(hinfo.probetime)
FROM systeminfo.hostinfo AS hinfo
INNER JOIN systeminfo.hosts AS hnames
ON hnames.idx = hinfo.host_idx
INNER JOIN systeminfo.disksinfo AS disks
ON disks.hostinfo_idx = hinfo.idx
GROUP BY
hinfo.idx,
hinfo.host_idx,
hinfo.processor_load,
hinfo.memory_total,
hinfo.memory_free,
hnames.idx,
hnames.name,
disks.hostinfo_idx,
disks.id,
disks.name,
disks.size,
disks.freespace
ORDER BY hnames.name,
disks.id
Note that this query might get you a different result! I'm just focusing on the problem, that you might get wrong data to the row you think holds the MAX(hinfo.probetime).
Or you solve it like this (and this will get you what you want):
SELECT hinfo.idx,
hinfo.host_idx,
hinfo.processor_load,
hinfo.memory_total,
hinfo.memory_free,
hnames.idx,
hnames.name,
disks.hostinfo_idx,
disks.id,
disks.name,
disks.size,
disks.freespace,
hinfo.probetime
FROM systeminfo.hostinfo AS hinfo
INNER JOIN systeminfo.hosts AS hnames
ON hnames.idx = hinfo.host_idx
INNER JOIN systeminfo.disksinfo AS disks
ON disks.hostinfo_idx = hinfo.idx
WHERE hinfo.probetime = (SELECT MAX(hi.probetime) FROM systeminfo.hostinfo AS hi
INNER JOIN systeminfo.hosts AS hn
ON hn.idx = hi.host_idx
INNER JOIN systeminfo.disksinfo AS d
ON d.hostinfo_idx = hi.idx
WHERE d.id = disks.id AND hn.name = hnames.name)
GROUP BY disks.id,
hnames.name
ORDER BY hnames.name,
disks.id
There's also a nice example in the manual about this: The Rows Holding the Group-wise Maximum of a Certain Column
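The group-wise maximum pattern can be sketched as below. This is a simplified variant, not the exact query above: the correlated subquery is keyed on host_idx alone, which is enough when "latest" is defined per host. SQLite (through Python's sqlite3 module) stands in for MySQL, the tables are reduced to the columns involved, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts (idx INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hostinfo (idx INTEGER PRIMARY KEY, host_idx INT, probetime TEXT);
CREATE TABLE disksinfo (idx INTEGER PRIMARY KEY, hostinfo_idx INT, id TEXT, freespace INT);
-- Invented sample data: two hosts, two snapshots each.
INSERT INTO hosts VALUES (1,'alpha'),(2,'beta');
INSERT INTO hostinfo VALUES
  (1,1,'2013-01-01 10:00:00'),(2,1,'2013-01-01 10:05:00'),
  (3,2,'2013-01-01 10:00:00'),(4,2,'2013-01-01 10:05:00');
INSERT INTO disksinfo VALUES
  (1,1,'C:',100),(2,2,'C:',90),(3,2,'D:',500),(4,3,'C:',70),(5,4,'C:',60);
""")

latest = conn.execute("""
    SELECT hn.name, hi.probetime, d.id, d.freespace
    FROM hostinfo AS hi
    JOIN hosts AS hn ON hn.idx = hi.host_idx
    JOIN disksinfo AS d ON d.hostinfo_idx = hi.idx
    -- Keep only the snapshot whose probetime is the newest for its host.
    WHERE hi.probetime = (SELECT MAX(h2.probetime)
                          FROM hostinfo AS h2
                          WHERE h2.host_idx = hi.host_idx)
    ORDER BY hn.name, d.id
""").fetchall()

for row in latest:
    print(row)
```

No GROUP BY is needed in this form: the WHERE clause selects whole rows, so every column really belongs to the newest snapshot. An index on hostinfo(host_idx, probetime) would be the natural support for the subquery in MySQL, though that is an assumption about the workload.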
I’ve implemented a closure table system in MySQL for a hierarchy group list.
The groups are in table company_groups with columns ID and Name
The closure table is company_groups_treepaths:
CREATE TABLE `company_groups` (
`id` char(36) NOT NULL default '',
`name` varchar(150) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `company_groups_treepaths` (
`ParentID` char(36) NOT NULL default '',
`ChildID` char(36) NOT NULL default '',
`PathLength` int(11) NOT NULL default '0',
PRIMARY KEY (`ParentID`,`ChildID`),
KEY `PathLength` (`PathLength`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I am now trying to get a tree structure out of it. The problem is that most of the solutions I find use group_concat on the group id, assuming it is an auto_increment INT.
However, I use GUIDs, which makes it harder. I've looked through the other examples here, but can't really get the hang of it.
For example, this query retrieves the right groups, but the wrong tree:
SELECT SQL_CALC_FOUND_ROWS p.`ChildID`, p.ParentID, d.name, CONCAT(REPEAT('-', p.`PathLength`), d.`name`) as path, p.`PathLength` as depth
FROM
`company_groups` AS d
JOIN `company_groups_treepaths` AS p ON d.`id` = p.`ChildID`
JOIN `company_groups_treepaths` AS crumbs ON crumbs.`ChildID` = p.`ChildID`
WHERE
p.`ParentID` = 'aa420c70-7050-11e2-b75d-672efc30777e'
GROUP BY d.id
ORDER BY GROUP_CONCAT(crumbs.`PathLength`)
SQL Fiddle here: http://sqlfiddle.com/#!2/474d4/2
The correct order for that query should be (fetching all children of Swedbank):
Swedbank (aa420c70-7050-11e2-b75d-672efc30777e)
hejsan (44b2b680-7f44-11e2-b04d-918fe8c8d065)
Östergötland (aa420970-7050-11e2-893a-7f63b55a76db)
Regional1 (a6adc800-7050-11e2-9db0-ad8ff41db08c)
asd (56fd15a0-7f44-11e2-b10f-55240ef76c28)
hejsan3 (fc14c320-7f44-11e2-a2bb-ed51f02fd80f)
Under öster (bb6b93a0-80ea-11e2-be1d-fd97d33aad97)
Småland (ae5dc150-7050-11e2-9b11-c96b3591816c)
asdasd (534e3f00-80df-11e2-b92e-fd29e414f3fd)
asd (6e640160-80de-11e2-8c41-d135d36c28db)
hejsan2 (d95a7060-80be-11e2-8179-0b9231964800)
Anyone got any good ideas for tree listing, using GUID?
The function itself won't be called very very often, so I'm fairly open for sub-query suggestions as well if it's necessary to solve the problem.
I reverted to trying out the basics, following outlines found on http://karwin.blogspot.se/2010/03/rendering-trees-with-closure-tables.html
This is the query that eventually worked:
select group_concat(n.name order by a.PathLength desc separator ' -> ') as fullpath, CONCAT(REPEAT('-', d.`PathLength`), cg.`name`) as path, d.ChildID as group_id, d.PathLength as depth, cg.name
from company_groups_treepaths d
join company_groups_treepaths a on (a.ChildID = d.ChildID)
join company_groups n on (n.id = a.ParentId)
join company_groups cg on (cg.id = d.ChildID)
where d.ParentID = 'aa420c70-7050-11e2-b75d-672efc30777e' and cg.deleted = 0
group by d.ChildID
order by fullpath