GeoServer creates slow PostgreSQL queries

I'm experiencing some slowness with GeoServer + PostGIS: filtering a layer makes GeoServer take a huge amount of time to render the tiles.
For instance, I have an array of IDs that need to be shown, so I build a CQL_FILTER like
CQL_FILTER="id IN ('1', '2')"
I have checked the query log in PostgreSQL and the generated query looks like
SELECT "objectid",encode(ST_AsBinary(ST_Force2D("the_geom")),'base64') as "the_geom" FROM "public"."table" WHERE ((("id" = '1' AND "id" IS NOT NULL ) OR ("id" = '2' AND "id" IS NOT NULL )
In practice I need to query hundreds of IDs, and the PostgreSQL query takes about 30 seconds to finish. But if I just run this query:
SELECT "objectid",encode(ST_AsBinary(ST_Force2D("the_geom")),'base64') as "the_geom" FROM "public"."table" WHERE id IN ('1', '2')
the query finishes in ~1s. Is there any way to "optimize" the way GeoServer writes the queries?
I have tried tuning the PostgreSQL server by increasing cache sizes, optimizing for an M.2 SSD, and prewarming.
Thanks!
Edit:
Removing the rows where the ID is null, adding a NOT NULL constraint to the column in PostgreSQL, and reloading the layer in GeoServer removed the "id" IS NOT NULL checks from the query, and the query time is half what it used to be, but it's still not good enough. It really should be just a simple IN query.
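For reference, the cleanup was roughly the following (a sketch; the table and column names match the logged query above, and the DELETE assumes the null-ID rows really are disposable):
-- Drop the rows whose ID is null (assumes they are safe to discard)
DELETE FROM "public"."table" WHERE "id" IS NULL;
-- Declare that the column can no longer contain nulls
ALTER TABLE "public"."table" ALTER COLUMN "id" SET NOT NULL;
Then reload the feature type in GeoServer so it picks up the new constraint.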

Upgrading PostgreSQL to 10.2 fixed the issue; the index is still not used for the OR query, but it's 10 times faster.

Related

MySQL index misbehaving in Rails

I have a Rails app backed by MySQL. There is a reservations table with an index on the (nullable) column rescheduled_reservation_id.
Two parts of my Rails app query reservations by the rescheduled_reservation_id field, as below:
Transit::Reservation.find_by(rescheduled_reservation_id: 25805)
which produces the following log output:
Transit::Reservation Load (60.3ms) SELECT `transit_reservations`.* FROM `transit_reservations` WHERE `transit_reservations`.`deleted_at` IS NULL AND `transit_reservations`.`rescheduled_reservation_id` = 25805 LIMIT 1
However, the other part of the app:
Transit::Reservation.where(rescheduled_reservation_id: 25805).last
produces the log output below:
Transit::Reservation Load (2.3ms) SELECT `transit_reservations`.* FROM `transit_reservations` WHERE `transit_reservations`.`deleted_at` IS NULL AND `transit_reservations`.`rescheduled_reservation_id` = 25805 ORDER BY `transit_reservations`.`id` DESC LIMIT 1
As can be seen, the first query
Transit::Reservation Load (60.3ms) SELECT `transit_reservations`.* FROM `transit_reservations` WHERE `transit_reservations`.`deleted_at` IS NULL AND `transit_reservations`.`rescheduled_reservation_id` = 25805 LIMIT 1
took up to 60 ms, suggesting the index might not have been used properly, compared to the 2.3 ms of this one:
Transit::Reservation Load (2.3ms) SELECT `transit_reservations`.* FROM `transit_reservations` WHERE `transit_reservations`.`deleted_at` IS NULL AND `transit_reservations`.`rescheduled_reservation_id` = 25805 ORDER BY `transit_reservations`.`id` DESC LIMIT 1
I also tried to debug further by running EXPLAIN on both queries, and I got back the same result, i.e. the index on rescheduled_reservation_id being used.
Has anyone experienced this issue? I am wondering whether the Rails MySQL connection (I am using the mysql2 gem) might cause the MySQL server to not choose the right index.
It's Rare, but Normal.
The likely answer is that the first occurrence did not find the blocks it needed cached in the buffer_pool. So, it had to fetch them from disk. On a plain ole HDD, a Rule of Thumb is 10ms per disk hit. So, maybe there were 6 blocks that it needed to fetch, leading to 60.3ms.
Another possibility is that other activities were interfering, thereby slowing down this operation.
2.3ms is reasonable for a simple query like that which can be performed entirely with cached blocks in RAM.
Was the server recently restarted? After a restart, there is nothing in cache. Is the table larger than innodb_buffer_pool_size? If so, that would lead to 60ms happening sporadically -- blocks would get bumped out. (Caveat: The buffer_pool should not be made so big that 'swapping' occurs.)
A block is 16KB; it contains some rows of data or rows of index or nodes of a BTree. Depending on the size of the table, even that 'point query' might have needed to look at 6 blocks or more.
If you don't get 2.3ms most of the time, we should dig deeper. (I have hinted at sizes to investigate.)
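If you want to check the sizes hinted at, here is a quick sketch (standard MySQL commands; the table name comes from the question):
-- How big is the buffer pool?
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- How big are the table's data and indexes, in bytes?
SELECT data_length, index_length
FROM information_schema.tables
WHERE table_name = 'transit_reservations';
If data_length + index_length comfortably exceeds the buffer pool, sporadic disk hits like the 60ms one are expected.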

INSERT that runs infinitely in MySQL

I have an odd problem that I can't figure out. I'm not much of a MySQL guy (more of a SQL Server person), and I have an INSERT statement that is running (seemingly) forever.
INSERT INTO voter_registration_v2.temp_address_map (person_id, address_id)
select person_id, address_id from
voter_registration.voters v
inner join voter_registration_v2.address a
on v.house_num = a.house_num
and v.half_code = a.half_code
and v.street_dir = a.street_dir
and v.street_name = a.street_name
and v.street_type_cd = a.street_type_cd
and v.street_sufx_cd = a.street_sufx_cd
and v.unit_designator = a.unit_designator
and v.unit_num = a.unit_num
and v.res_city_desc = a.res_city_desc
and v.state_cd = a.state_cd
and v.zip_code = a.zip_code;
The SELECT itself runs in 20s, with 16s to fetch. When I run it with the INSERT, I've timed out at 6000s. All tables are using the MyISAM engine. I attempted InnoDB originally but it didn't make a difference. It definitely is a large insert - about 600k records. Below is the CREATE for the temp table.
CREATE TABLE temp_address_map (
person_id int PRIMARY KEY,
address_id INT
);
However, even with 600k - I can't imagine an INSERT taking 100+ minutes if the SELECT only takes ~30s. Appreciate any suggestions.
I have noticed odd problems with my local installation of MySQL anyway. Some SELECT statements that take 0.5 seconds or less randomly began running forever as well. The ONLY way I could fix the problem was to uninstall and reinstall the server. I must have run through 100 suggestions on forums before I gave up. It's almost like MySQL gets progressively slower until it's unusable (my RAM is around 48% used). Kind of odd; not sure that's what is going on here though...
You are correct. Under most circumstances, a select query that returns in 20s should not take hours for the insert. However, I would caution that you may be timing the select based on the "first row" that is returned. The insert doesn't return until all rows have been processed.
You have a very detailed ON clause. I would suggest a composite index on all the columns used in the clause (starting from the most general to the least general):
create index idx_address_allkeys
on address(state_cd, res_city_desc, zip_code, street_name, . . . );
In other words, I am guessing that your code is using a nested loop join, returning one row at a time.
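Spelling the suggestion out with every column from the ON clause in the question (the most-general-first ordering is the heuristic above; if the combined key exceeds MySQL's index length limit, drop the least selective trailing columns):
-- Composite index covering all join columns, most general first
CREATE INDEX idx_address_allkeys ON address
(state_cd, res_city_desc, zip_code, street_name, street_type_cd,
street_sufx_cd, street_dir, house_num, half_code, unit_designator, unit_num);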

MySQL update performance VERY bad

I'm having VERY bad performance with UPDATE on MySQL, my update statement is quite basic like:
UPDATE `tbl_name`
SET `field1` = 'value1', `field2` = 'value2' .. `fieldN` = 'valueN'
WHERE `tbl_name`.`id` = 123;
The values are few (15), all of TEXT type, and the WHERE condition is a single match on id.
The values are JSON strings (but this should not bother MySQL; it should see them as just plain text).
In tbl_name I have few records (around 4k).
The problem is that this UPDATE statement takes 8 seconds to execute (as reported by the MySQL slow query log).
I'm running MySQL alone on an EC2 High-CPU Medium instance, and I think it's pretty much impossible that this performance is "normal"; I would expect much better.
Do you have any idea to investigate the problem?
** UPDATE **
Thank you for your fast answers. The table is InnoDB and id is the PRIMARY KEY (unique). The values are TEXT (not VARCHAR).
** UPDATE bis **
No, id is an integer; all the other fields are TEXT.
Since MySQL does not support EXPLAIN on UPDATE statements before version 5.6.3, we're quite blind about this query. Try a USE INDEX hint...
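On a pre-5.6.3 server, a common workaround is to EXPLAIN a SELECT with the same WHERE clause, which for a single-table statement should use the same access path (a sketch using the query from the question):
-- Pre-5.6.3 workaround: inspect the plan of an equivalent SELECT
EXPLAIN SELECT * FROM `tbl_name` WHERE `tbl_name`.`id` = 123;
-- key: PRIMARY and type: const would confirm the primary key is being used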
I've run the same thing on my server. All was OK with 15 TEXT fields and 4096 rows of fairly arbitrary text. It was OK with both USE INDEX(PRIMARY) and IGNORE INDEX(PRIMARY).
So, I suppose, you have a problem with your SQL server, installation package, or whatever, not the query...

MySQL UPDATE performance suddenly abysmal

MySQL 5.1, Ubuntu 10.10 64bit, Linode virtual machine.
All tables are InnoDB.
One of our production machines uses a MySQL database containing 31 related tables. In one table, there is a field containing display values that may change several times per day, depending on conditions.
These changes to the display values are applied lazily throughout the day during usage hours. A script periodically runs and checks a few inexpensive conditions that may cause a change, and updates the display value if a condition is met. However, to keep background process load to a minimum during working hours, this lazy method doesn't catch all possible scenarios in which the display value should be updated.
Once per night, a script purges all display values stored in the table and recalculates them all, thereby catching all possible changes. This is a much more expensive operation.
This has all been running consistently for about 6 months. Suddenly, 3 days ago, the run time of the nightly script went from an average of 40 seconds to 11 minutes.
The overall proportions on the stored data have not changed in a significant way.
I have investigated as best I can, and the part of the script that is suddenly running slower is the final update statement that writes the new display values. It is executed once per row, given the INT(11) id of the row and the new display value (also an INT).
update `table` set `display_value` = ? where `id` = ?
The funny thing is, that the purge of all the previous values is executed as:
update `table` set `display_value` = null
And this statement still runs at the same speed as always.
The display_value field is not indexed; id is the primary key. There are 4 other foreign keys in the table that are not modified at any point during execution.
And the final curve ball: if I dump this schema to a test VM and execute the same script, it runs in 40 seconds, not 11 minutes. I have not attempted to rebuild the schema on the production machine, as that's simply not a long-term solution and I want to understand what's happening here.
Is something off with my indexes? Do they get cruft in them after thousands of updates on the same rows?
Update
I was able to completely resolve this problem by running optimize on the schema. Since InnoDB doesn't support optimize, this forced a rebuild, and resolved the issue. Perhaps I had a corrupted index?
mysqlcheck -A -o -u <user> -p
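For a single table, the rebuild can also be triggered directly (on InnoDB, MySQL reports that it maps this to a table recreate plus ANALYZE rather than a true optimize):
-- Rebuild one table; on InnoDB this does a recreate + analyze
OPTIMIZE TABLE `table`;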
There is a chance that the UPDATE statement won't use the index on id; however, that's very improbable (if possible at all) for a query like yours.
Is there a chance your table is locked by a long-running concurrent query / DML statement? Which engine does the table use?
Also, updating the table record by record is not efficient. You can load your values into a temporary table in bulk and update the main table with a single command:
CREATE TEMPORARY TABLE tmp_display_values (id INT NOT NULL PRIMARY KEY, new_display_value INT);
INSERT
INTO tmp_display_values
VALUES
(?, ?),
(?, ?),
…;
UPDATE `table` dv
JOIN tmp_display_values t
ON dv.id = t.id
SET dv.display_value = t.new_display_value;

Slow SELECT when inserting large amounts of data (MySQL)

I have a process that imports a lot of data (950k rows) using INSERTs that insert 500 rows at a time. The process generally takes about 12 hours, which isn't too bad. Normally a query on the table is pretty quick (under 1 second), as I've put (what I think to be) the proper indexes in place. The problem I'm having is trying to run a query while the import process is running: it makes the query take almost 2 minutes! What can I do to make these two things not compete for resources? I've looked into INSERT DELAYED, but I'm not sure I want to change the table to MyISAM.
Thanks for any help!
Have you tried using priority hints?
SELECT HIGH_PRIORITY ... and INSERT LOW_PRIORITY ...
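A sketch of where those keywords go, using the properties table that appears later in this thread (note these hints only matter for engines with table-level locking, such as MyISAM):
-- LOW_PRIORITY makes the insert wait until no readers want the table
INSERT LOW_PRIORITY INTO properties (state, county) VALUES ('Florida', 'Hillsborough');
-- HIGH_PRIORITY lets the select jump ahead of queued writes
SELECT HIGH_PRIORITY * FROM properties WHERE state = 'Florida';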
12 hours to insert 950k rows is pretty heavy duty. How big are these rows? What kind of indexes are on them? Even if the actual data insertion goes quickly, the continual updating of the indexes will definitely cause performance degradation for anything using those table(s) at the time.
Are you doing these imports with the bulk INSERT syntax (insert into tab (x) values (a), (b), (c), etc...) or one INSERT per row? Doing the bulk insert will require a longer index-updating period (as it has to generate index data for 500 rows) than doing it for a single row. There will no doubt be some sort of internal lock placed on the indexes while the data is updated, in which case you're contending with 950k/500 = 1,900 locking sessions at minimum.
I found that on some of my bulk-insert scripts (an HTTP log analyzer for some custom data mining), it was quicker to DISABLE the indexes on the relevant tables, then re-enable/rebuild them after the data dump was completed, as sketched below. If I remember right, it was about 37 minutes to insert 200,000 rows of hit data with keys enabled, and about 3 minutes with no indexing.
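On MyISAM that trick looks like this (a sketch; DISABLE KEYS only suspends non-unique indexes, and hit_data is a hypothetical table name):
-- Suspend non-unique index maintenance
ALTER TABLE hit_data DISABLE KEYS;
-- ... run the bulk INSERTs here ...
-- Rebuild the suspended indexes in one pass
ALTER TABLE hit_data ENABLE KEYS;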
So I finally found the slowdown while searching during the import of my data. I had one query like this:
SELECT * FROM `properties` WHERE (state like 'Florida%') and (county like 'Hillsborough%') ORDER BY created_at desc LIMIT 0, 50
and when I ran an EXPLAIN on it, I found out it was scanning around 215,000 rows (even with proper indexes on state and county in place). I then ran an EXPLAIN on the following query:
SELECT * FROM `properties` WHERE (state = 'Florida') and (county = 'Hillsborough') ORDER BY created_at desc LIMIT 0, 50
and saw that it only had to scan 500 rows. Considering that the actual result set was something like 350, I think I identified the slowdown.
I've made the switch to not using "like" in my queries and am very happy with the snappier results.
Thanks to everyone for your help and suggestions. They are much appreciated!
You can try importing your data into an auxiliary table and then merging it into the main table. You don't lose performance on your main table, and I think your DB can manage the merge much faster than the many separate insertions.
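A minimal sketch of that pattern, assuming the properties table from the question (the staging table name is illustrative):
-- Load the import into a staging table, then merge in one statement
CREATE TABLE properties_staging LIKE properties;
-- ... bulk INSERTs go into properties_staging ...
INSERT INTO properties SELECT * FROM properties_staging;
DROP TABLE properties_staging;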