

Postgres will use the multi-column index for queries on the first column. If you have a table with a column included as the first column in a multi-column index and then again with its own index, you may be over-indexing.

First a pointer to the postgres docs that I can never find, and then data on performance of multi-column indexes vs single.

“A multicolumn B-tree index can be used with query conditions that involve any subset of the index’s columns, but the index is most efficient when there are constraints on the leading (leftmost) columns.”

If you click around that section of the docs, you’ll surely come across the section on multi-column indexing and performance, in particular this section (bold emphasis mine):

“You could also create a multicolumn index on (x, y). This index would typically be more efficient than index combination for queries involving both columns, but as discussed in Section 11.3, it would be almost useless for queries involving only y, so it should not be the only index. A combination of the multicolumn index and a separate index on y would serve reasonably well. For queries involving only x, the multicolumn index could be used, though it would be larger and hence slower than an index on x alone.”

Life is full of tradeoffs performance-wise, so we should explore just how much slower it is to use a multi-column index for single-column queries.

    CONSTRAINT foos_and_bars_pkey PRIMARY KEY (id)

Then, using R, we’ll create 3 million rows of nicely distributed data:

    data = data.frame(foo_id = sample(foo_ids, rows, TRUE), bar_id = sample(bar_ids, rows, TRUE))

Dump that to a text file and load it up with copy and we’re good to go.

Run a simple query to make sure the index is used:

    test_foo=# explain analyze select * from foos_and_bars where foo_id = 123
    Bitmap Heap Scan on foos_and_bars (cost=4.68..55.74 rows=13 width=12) (actual time=0.026..0.038 rows=8 loops=1)

Now we’ll make 100 queries by foo_id with this index, and then repeat with the single index installed, using this code:

    conn = PGconn.open(:dbname => 'test_foo')
    TEST_IDS = #randomly selected 100 ids in R
    res = conn.exec("select * from foos_and_bars where foo_id = #
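The comparison described in this post, running the same 100 lookups by foo_id once against the multi-column index and once against the single-column index, comes down to timing a loop of queries. Here is a minimal sketch of such a timing harness. It is not the original benchmark script: `time_lookups` and `FakeConn` are hypothetical names, the ids are placeholders rather than the R-sampled ones, and the stand-in connection only exists so the sketch runs without a database. With the pg gem you would pass a real connection instead, since `exec_params` is part of its connection API.

```ruby
require 'benchmark'

# Hypothetical helper: runs one lookup per id and returns the total
# wall-clock time in seconds. `conn` is anything that responds to
# exec_params(sql, params), such as a pg gem connection.
def time_lookups(conn, ids)
  sql = 'select * from foos_and_bars where foo_id = $1'
  Benchmark.realtime do
    ids.each { |id| conn.exec_params(sql, [id]) }
  end
end

# Stand-in connection (an assumption for this sketch): it records the
# calls it receives and returns an empty result, touching no database.
class FakeConn
  attr_reader :calls

  def initialize
    @calls = []
  end

  def exec_params(sql, params)
    @calls << [sql, params]
    []
  end
end

conn = FakeConn.new
test_ids = (1..100).to_a # placeholder ids; the post samples them in R
elapsed = time_lookups(conn, test_ids)
puts format('%d queries in %.6f s', conn.calls.size, elapsed)
```

To compare the two indexes, you would run `time_lookups` with the multi-column index installed, swap in the single-column index, and run it again with the same ids. Repeating each run a few times helps separate index cost from caching effects.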
