  1. Dev - Nexus Repo
  2. NEXUS-12844

Upgrade Apache Tika dependency to 1.14

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.4.0
    • Component/s: Maven
    • Labels:
      None
    • Story Points:
      1
    • Epic Link:
    • Sprint:
      Formats/Core Team - Sprint 92, Formats/Core Team - Sprint 93

      Description

      While investigating 10087, I found a slight improvement in Tika's behaviour:

      Tika 1.14 still wrongly identifies an XML file as text/html if it contains any tag starting with <html... However, if the XML file has a comment at the start, as the POM in question does (note: this works even if the comment is empty, <!-- -->), Tika correctly identifies it as text/xml.
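
To compare the two cases side by side, one option is to generate a pair of otherwise identical sample POMs, one with an empty leading comment and one without, and feed both to the detector. The sketch below only builds the sample files; it does not call Tika. All names and file contents are hypothetical, and placing the comment directly after the XML declaration is an assumption, since the description only says "at the start".

```python
import os
import tempfile

# Hypothetical sample content; only the <html-prefixed tag name matters here.
XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'
EMPTY_COMMENT = "<!-- -->\n"
POM_BODY = (
    '<project xmlns="http://maven.apache.org/POM/4.0.0">\n'
    "  <modelVersion>4.0.0</modelVersion>\n"
    "  <groupId>com.example</groupId>\n"
    "  <artifactId>sample-parent</artifactId>\n"
    "  <version>1.0.0</version>\n"
    "  <properties>\n"
    "    <!-- a tag whose name starts with 'html' -->\n"
    "    <htmlReportDir>target/site</htmlReportDir>\n"
    "  </properties>\n"
    "</project>\n"
)


def build_pom(leading_comment):
    """Return sample POM text, optionally with an empty comment before <project>.

    Assumption: the comment goes right after the XML declaration, because a
    comment placed before the declaration would make the file malformed.
    """
    comment = EMPTY_COMMENT if leading_comment else ""
    return XML_DECLARATION + comment + POM_BODY


def write_samples(directory):
    """Write both variants into `directory` and return their paths."""
    paths = {}
    for name, flag in (("without-comment.pom", False), ("with-comment.pom", True)):
        path = os.path.join(directory, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(build_pom(flag))
        paths[name] = path
    return paths


if __name__ == "__main__":
    for name, path in write_samples(tempfile.mkdtemp()).items():
        print(name, "->", path)
```

The two files differ only by the `<!-- -->` line, so any difference in the detected type can be attributed to the comment.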

       

      To test:

      1. Create a proxy repo called "RSO" and point it to https://repository.sonatype.org/service/local/repositories/sonatype-internal/content/
      2. Request http://localhost:8081/repository/RSO/com/sonatype/insight/ci/insight-ci-parent/2.14.4/insight-ci-parent-2.14.4.pom 
      3. You should be presented with a POM rather than a 404.
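
Steps 2 and 3 can also be checked from a script. The following is a minimal sketch, assuming Nexus is running locally on port 8081 and the "RSO" proxy repository from step 1 already exists. The helper names and the `<project` body check are illustrative assumptions, not part of the original test plan.

```python
import urllib.error
import urllib.request

# URL from step 2 of the test plan.
POM_URL = (
    "http://localhost:8081/repository/RSO/com/sonatype/insight/ci/"
    "insight-ci-parent/2.14.4/insight-ci-parent-2.14.4.pom"
)


def fetch_status(url, timeout=30):
    """GET `url` and return (status_code, body); HTTP error codes are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def looks_like_pom(body):
    """Loose heuristic (an assumption): a POM body contains a <project element."""
    return b"<project" in body


def pom_served(status, body):
    """True when the response is a 200 whose body looks like a POM."""
    return status == 200 and looks_like_pom(body)


if __name__ == "__main__":
    try:
        status, body = fetch_status(POM_URL)
    except OSError as err:
        # Covers connection failures such as Nexus not running.
        print("Could not reach Nexus:", err)
    else:
        print("HTTP", status, "- POM served" if pom_served(status, body) else "- no POM")
```

A 404 here would correspond to the failure described in step 3; a 200 with a POM body corresponds to the expected result.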

       

              People

              Assignee:
              jtom Joe Tom
              Reporter:
              jstephens Joseph Stephens
              Last Updated By:
              Peter Lynch
              Votes:
              0
              Watchers:
              3

