After working for hours on different parameters and tuning the database, including setting up a spare server to see if a dump and restore fixes the issues I got the time down to 5 minutes.
Sounds good?
well not really since I needed an execution in the range of a couple of seconds. So I actually looked at the sql statement which caused issues and found a gigantic error in it...
the statement:
select * from BIN where bin_id not in ( SELECT bin_id from SPECTRA where sample_id = 424105 AND SPECTRA.bin_id = BIN.bin_id)
and the correct statement would be:
select * from BIN a where a.bin_id not in ( SELECT bin_id from SPECTRA where sample_id = 424105 and bin_id is not null) ORDER BY retention_index ASC
the problem?
SPECTRA.bin_id = BIN.bin_id
this little part is totally wrong in the whole statement and causes a seq scan over a table with 70 Million entries against a table with 6000 entries. It does nothing for the queries and I guess it was left over from a copy/paste accident 6 years ago when I wrote this script originally.
let's compare some numbers:
explain before:
Seq Scan on bin (cost=0.00..459513.42 rows=3044 width=1290)
SubPlan
Filter: (NOT (subplan))
-> Bitmap Heap Scan on spectra (cost=75.20..75.42 rows=2 width=8)
Recheck Cond: ((sample_id = 424105::double precision) AND (bin_id = ($0)::double precision))
-> BitmapAnd (cost=75.20..75.20 rows=2 width=0)
-> Bitmap Index Scan on spectra_sample_id (cost=0.00..18.99 rows=5696 width=0)
-> Bitmap Index Scan on spectra_binid (cost=0.00..56.18 rows=17658 width=0)
Index Cond: (sample_id = 424105::double precision)
Index Cond: (bin_id = ($0)::double precision)
explain after:
Sort (cost=1765.47..1766.23 rows=3044 width=1290)
Sort Key: retention_index
-> Seq Scan on bin a (cost=647.21..1747.86 rows=3044 width=1290)
Filter: (NOT (hashed subplan))
SubPlan
-> Index Scan using spectra_sample_id on spectra (cost=0.00..646.42 rows=3138 width=8)
Index Cond: (sample_id = 424105::double precision)
Filter: (bin_id IS NOT NULL)
in real world number this translates to 24 hours before statement compared to 4 seconds after statement...
...