Dev - Nexus Repo / NEXUS-8884

Most search fields have no partial matching ability

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-m4, 3.0.0-m5, 3.0.0-m6, 3.0.0-m7, 3.0.0, 3.1.0, 3.2.0, 3.2.1, 3.4.0, 3.5.1, 3.6.0, 3.7.0, 3.10.0, 3.12.1
    • Fix Version/s: 3.14.0
    • Component/s: Documentation, Search, UX
    • Environment:
      Chrome Windows7, Chrome MacOSX

      Description

      There are two issues with search fields that make them hard to use.

      1. Most fields don’t support partial matching (e.g. searching for “aether” in the group field will not match “org.eclipse.aether”)
      2. Fields that do support wildcards match only whole tokens (e.g. searching for “aether” in the keyword field will match “aether-api”, but searching for “aeth” will not)
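
      The difference between the two issues can be illustrated with a small sketch. It is a simplified model, not the actual Elasticsearch analysis chain: the tokenizer below (splitting on non-alphanumeric characters) and the function names are assumptions made for illustration.

```python
import re


def tokens(value: str) -> list[str]:
    """Split a value into lowercase tokens on non-alphanumeric characters.

    Assumption: this only approximates what an analyzed field would do.
    """
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]


def exact_match(query: str, value: str) -> bool:
    """Models a not_analyzed field: the whole value must equal the query."""
    return query == value


def token_match(query: str, value: str) -> bool:
    """Models an analyzed field: the query must equal one whole token."""
    return query.lower() in tokens(value)


def partial_match(query: str, value: str) -> bool:
    """Models the requested behavior: the query may appear anywhere."""
    return query.lower() in value.lower()


if __name__ == "__main__":
    print(exact_match("aether", "org.eclipse.aether"))    # issue 1
    print(token_match("aeth", "aether-api"))              # issue 2
    print(partial_match("aeth", "aether-api"))            # desired
```

      In this model, only partial_match accepts both “aether” against “org.eclipse.aether” and “aeth” against “aether-api”.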

      Acceptance criteria:

      • All fields will behave the same as the keyword field (quotes (""), wildcards (*), booleans, etc. apply only when the user adds them explicitly)

      Technical Notes

      • This approach may require a reindex. We should either do this automatically on upgrade, or provide a way for folks to reindex if they want the new functionality (the caveat is that this could cause support issues, since support will need to figure out if poor quality search results are stemming from the fact that people haven't reindexed yet).
      • Many of the fields that users can search are mapped as (string, not_analyzed), which means each value is indexed as a single exact term, so partial matching requires a wildcard query.
      • Another approach would be to reindex with an ngram analyzer to get partial matching, but the first approach is simpler (if slower).
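
      To make the trade-off between the two approaches concrete, here is a rough sketch in plain Python. It does not use Elasticsearch; the gram sizes, function names, and in-memory index are assumptions chosen for illustration. The wildcard approach scans every stored value at query time, while the n-gram approach does more work at index time so that a query becomes a single lookup.

```python
from fnmatch import fnmatchcase


def wildcard_scan(values: list[str], query: str) -> set[str]:
    """Wildcard approach: test every stored value against *query*."""
    pattern = f"*{query.lower()}*"
    return {v for v in values if fnmatchcase(v.lower(), pattern)}


def ngrams(value: str, min_n: int = 2, max_n: int = 10) -> set[str]:
    """All substrings of value with length between min_n and max_n."""
    value = value.lower()
    return {
        value[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(value) - n + 1)
    }


def build_ngram_index(values: list[str]) -> dict[str, set[str]]:
    """N-gram approach: index every gram of every value up front."""
    index: dict[str, set[str]] = {}
    for value in values:
        for gram in ngrams(value):
            index.setdefault(gram, set()).add(value)
    return index


def ngram_lookup(index: dict[str, set[str]], query: str) -> set[str]:
    """A single dictionary lookup; queries outside the gram range miss."""
    return index.get(query.lower(), set())


if __name__ == "__main__":
    stored = ["org.eclipse.aether", "aether-api", "junit"]
    index = build_ngram_index(stored)
    print(wildcard_scan(stored, "aeth"))
    print(ngram_lookup(index, "aeth"))
```

      Note that the n-gram index only exists after the values have been processed again, which corresponds to the reindex mentioned above.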

      The query would look something like this (a sketch of the structure; the elided clause stands for additional conditions):

      {
        "bool": {
          "filter": { "term": { "format": "maven2" } },
          "must": [
            {
              "bool": {
                "should": [
                  { "term": { "somefield": "somevalue" } },
                  { "wildcard": { "somefield": "*somevalue*" } }
                ]
              }
            },
            {
              ...
            }
          ]
        }
      }

      We should also probably switch to a pure term filter if the user quotes the string so they have a way to escape partial matching.
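
      A minimal sketch of how such a query could be assembled, including the quoted-string escape, is shown below. The function names, the "group" field name, and the quote-detection rule are hypothetical and are not taken from the Nexus codebase; the resulting dictionaries follow the query structure outlined above.

```python
def field_clause(field: str, value: str) -> dict:
    """Build the clause for one search field.

    Assumption: a value wrapped in double quotes means "exact match only".
    """
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        # Quoted input: pure term query, no partial matching.
        return {"term": {field: value[1:-1]}}
    # Unquoted input: match either the exact term or any value containing it.
    return {
        "bool": {
            "should": [
                {"term": {field: value}},
                {"wildcard": {field: f"*{value}*"}},
            ]
        }
    }


def build_query(repository_format: str, criteria: dict[str, str]) -> dict:
    """Combine a format filter with one clause per searched field."""
    return {
        "bool": {
            "filter": {"term": {"format": repository_format}},
            "must": [field_clause(f, v) for f, v in criteria.items()],
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_query("maven2", {"group": "aether"}), indent=2))
    print(json.dumps(build_query("maven2", {"group": '"org.eclipse.aether"'}), indent=2))
```

      With this shape, an unquoted “aether” produces both a term and a wildcard clause, while a quoted value produces only the term clause.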

              People

              Assignee:
              elijahel Elijah El-Haddad
              Reporter:
              jtom Joe Tom
              Last Updated By:
              Peter Lynch
              Team:
              Nexus - UX
              Votes:
              19
              Watchers:
              25
