Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support expression-based index columns #9682

Closed
knz opened this issue Oct 3, 2016 · 18 comments
Closed

sql: support expression-based index columns #9682

knz opened this issue Oct 3, 2016 · 18 comments
Assignees
Labels
A-sql-encoding Relating to the SQL/KV encoding. A-sql-semantics C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-queries SQL Queries Team X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@knz
Copy link
Contributor

knz commented Oct 3, 2016

Computed indexes are indexes on the result of an expression (as opposed to the direct value of a column). For example: CREATE INDEX a ON customer( LOWER(firstname) || ' ' || LOWER(lastname))

Suggested by @petermattis

@knz knz added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Oct 3, 2016
@knz knz added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Oct 27, 2016
@knz
Copy link
Contributor Author

knz commented Oct 27, 2016

Would help with #7186 too.

@petermattis petermattis added this to the Later milestone Feb 22, 2017
@knz knz added this to Triage in (DEPRECATED) SQL Front-end, Lang & Semantics via automation May 9, 2018
@knz knz moved this from Triage to Feature requests / pie-in-the-skie in (DEPRECATED) SQL Front-end, Lang & Semantics May 9, 2018
@knz knz added A-sql-semantics A-sql-encoding Relating to the SQL/KV encoding. labels May 15, 2018
@awoods187
Copy link
Contributor

This is one of the top unimplemented errors encountered by users

@awoods187 awoods187 reopened this Jun 4, 2018
(DEPRECATED) SQL Front-end, Lang & Semantics automation moved this from Feature requests / pie-in-the-skie to Triage Jun 4, 2018
@knz knz removed this from the Later milestone Jun 5, 2018
@knz knz moved this from Triage to Backlog in (DEPRECATED) SQL Front-end, Lang & Semantics Jun 5, 2018
@knz
Copy link
Contributor Author

knz commented Oct 3, 2018

Also requested by @rkruze in #30690:

CREATE UNIQUE INDEX IF NOT EXISTS table_unique_idx
ON test_table
USING btree
((data->>'code'), (data->>'name'), (data->>'email_address')) where data->>'is_active'= 'true';

@cevin
Copy link

cevin commented May 27, 2019

petition

@lopezator
Copy link
Contributor

We are hitting this also, some advance on this would be appreciated.

@jordanlewis
Copy link
Member

Moving to Optimizer backlog, cc @RaduBerinde @mgartner

@RaduBerinde
Copy link
Member

This is tracked by #24455.

BACKLOG, NO NEW ISSUES: SQL Optimizer automation moved this from Triage to Done Aug 5, 2020
@knz
Copy link
Contributor Author

knz commented Aug 5, 2020

I'm sorry you can't close an issue that's X-anchored-telemetry until the feature is complete and references to the issues in the code have been removed.

@knz knz reopened this Aug 5, 2020
BACKLOG, NO NEW ISSUES: SQL Optimizer automation moved this from Done to Triage Aug 5, 2020
@knz
Copy link
Contributor Author

knz commented Aug 5, 2020

Alternatively you can send a PR that changes the references in the code from #9682 to #24455 then close this. (But even then it feels icky given the refernces in previous versions)

@RaduBerinde RaduBerinde changed the title sql: implement computed indexes sql: support expression-based index columns Aug 5, 2020
@RaduBerinde RaduBerinde moved this from Triage to New features in BACKLOG, NO NEW ISSUES: SQL Optimizer Aug 5, 2020
@RaduBerinde
Copy link
Member

RaduBerinde commented Aug 5, 2020

Copying over comment some recent info from #24455 below.


I did a bit of research of what work would be necessary to support expression-based indexes:

  • opt/cat interface and sql descriptor changes - need a way to refer to expressions as index "columns"
  • represent scans of expression-based indexes in opt (need "fake" columnids for constraints)
  • opt operator statistics
  • exploration rules for scans - need to recognize instances of the index expressions in filters
  • mutation paths - need to plumb values for the expressions when indexes are updated
  • schema change / backfiller support

Some aspects would be simplified if we treated the expressions as virtual columns (computed but not stored) throughout the stack, but other aspects would become more complicated (it would be the first non-scannable table column - and the code around different kinds of table columns is already very complex).

Overall the amount of work is on the same order of magnitude as partial indexes.

@lopezator
Copy link
Contributor

@RaduBerinde any updates on this?

We need to apply an expression-based unique index to ensure unicity over a JSON field like, in PostgreSQL you could do:

CREATE UNIQUE INDEX test_data_foo_uq ON test ((data->>'foo'))

+info: https://dba.stackexchange.com/questions/161313/creating-a-unique-constraint-from-a-json-object/161345#161345

But in CRDB, until this is solved, we are planning to replicate the columns in the parent tablet by using a computed field:

CREATE TABLE test (
    id INT, 
    data JSONB, 
    foo STRING AS ((data->>'foo')) STORED
    );

Then apply the UQ over it:

CREATE UNIQUE INDEX test__foo__uq ON test(foo);

@mgartner
Copy link
Collaborator

@lopezator We've begun implementing expression-based indexes, but the feature won't be released until the 21.1 release at the earliest (Spring 2021). Until then, I think the workaround you mentioned is your best option.

@iameli
Copy link

iameli commented May 4, 2021

Checking in on this - looks like #57608 shipped late last year. Any plans to move forward with this? I think this is the only remaining issue prevent the Livepeer API server from using Cockroach.

@RaduBerinde
Copy link
Member

Yes, we have plans to support the syntax in 21.2. Note that in 21.1 (which is close to be released) we support virtual columns which allow you to do the same thing essentially.

@mgartner mgartner self-assigned this May 27, 2021
@mgartner mgartner added this to Triage in SQL Queries via automation May 27, 2021
@mgartner mgartner moved this from Triage to 21.2 in SQL Queries May 27, 2021
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@mgartner
Copy link
Collaborator

mgartner commented Aug 16, 2021

Expression indexes have landed on master and they will be released in version 21.2. I'll close the issue.

BACKLOG, NO NEW ISSUES: SQL Optimizer automation moved this from New features to Done Aug 16, 2021
SQL Queries automation moved this from 21.2 High Likelihood (90%) to Done Aug 16, 2021
@renevall
Copy link

Heads up, the following does not seem to work yet

 CREATE INDEX cache_02_17_2022_06_20_title_lower_idx ON cache_02_17_2022_06_20 (lower(title));

returns:

ERROR: version ExpressionIndexes must be finalized to use expression indexes

Seems it's related to 2bb519e#diff-606ca467041f6ba1f19e8ef8530d2976bf9b0db98477940ae9deb7b2fd5d1795L363 which would land on v22

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-encoding Relating to the SQL/KV encoding. A-sql-semantics C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-queries SQL Queries Team X-anchored-telemetry The issue number is anchored by telemetry references.
Development

No branches or pull requests