Showing posts with label tuning. Show all posts
Showing posts with label tuning. Show all posts

Friday, March 5, 2010

Postgres - partial indexes and sort indexes

this morning it was a sad moment. Normally people ask me all the time about SQL optimization or SQL tuning or when to create an index and so on.

This morning i had another DOOOH moment.

Basically while trying to optimize BinBase even further I noticed that some queries take an absurd amount of time to finish. Like 500 seconds to query the bin table.

So I started to investigate and discovered that some of my queries are using seq scans over huge tables. For no apparent reason.

It turned out that the sort column was not indexed...


CREATE INDEX bin_retention_index ON bin (retention_index ASC NULLS LAST);


which was greatly improved after executing this statement.

Now there was still a huge wait time for a certain query


explain select * from BIN where bin_id not in ( SELECT bin_id from SPECTRA where sample_id = 1630733 and bin_id is not null )ORDER BY retention_index ASC


and it turned out that indexes were never used for the 'is not null' part. After some research and a lot of head scratching it turned out that postgres supports partial indexes for exactly this case.


create index spectra_bin_id_partial_is_not_null on spectra (bin_id) where bin_id is not null


Now afterward we some improvement but the actually slow data access is caused by something else. We always need the complete table - a few compounds and this operation takes some time.

Time to optimize the internal database lookup...

Wednesday, March 3, 2010

SetupX - tuning and the missing indexes

after analyzing a couple of hundred sql statements in setupx I noticed that there is no real use of indexes for some reason. Why there are no indexes escapes my mind, but since we prefer higher query speed I suggest the creation of the following indexes (which are far from perfect)

create index pubdata_relatedExperimentID_index on pubdata(relatedExperimentID)
create index NCBIClassifierID_ncbiID_index on NCBIClassifierID(ncbiID)
create index formobject_question_index on formobject(question)
create index formobject_discriminator_index on formobject(discriminator)
create index formobject_value_index on formobject(value(100))
create index query_querytype_index on query(querytype)
create index query_userid_index on query(userID)
create index datafile_source_index on datafile (source)
create index cache_experiment_id_index on cache (experimentID)
create index user_password_index on user (passwd)
create index user_username_index on user (username)
create index BlockedIP_blockedIp_index on BlockedIP(blockedIP)

this should improve the performance nicely and is currently being applied to our production system.