Dev - Nexus / NEXUS-3915

When artifacts are blocked because they fail content validation, no entry is made in the RSS feeds

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8
    • Fix Version/s: 1.9.1
    • Component/s: System Feeds
    • Labels:
      None

      Description

      Our Websense decided that some jars were adult content (Yay!) and Nexus detected that the jar was not a jar.
      However, Nexus made the following entries in the log files:

      2010-11-04 17:42:11 INFO [tp-31596370-427] - o.s.n.i.c.MavenArch~ - Failed to parse Maven artifact /home/software/nexus/nexus-professional-webapp-1.8.0/./../sonatype-work/nexus/storage/central/org/eclipse/text/3.3.0-v20070606-0010/text-3.3.0-v20070606-0010.jar due to error in opening zip file
      2010-11-04 17:42:11 INFO [tp-31596370-427] - o.s.n.i.c.MavenArch~ - Failed to parse Maven artifact /home/software/nexus/nexus-professional-webapp-1.8.0/./../sonatype-work/nexus/storage/central/org/eclipse/text/3.3.0-v20070606-0010/text-3.3.0-v20070606-0010.jar due to error in opening zip file

      But there was no entry in any of the following feeds
      Error and Warning events
      Broken artifacts in all Nexus repositories (...)
      Broken files in all Nexus repositories (...)

      So if you are monitoring these feeds then everything looks healthy, and it is not until you resort to the logs that you see something is wrong.
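
For readers unfamiliar with what "the jar was not a jar" means in practice, here is a minimal sketch of one possible check, assuming a validator that only compares the first bytes of a fetched `.jar` against the ZIP signatures. The function name and the sample payloads are hypothetical, and this is not Nexus's actual implementation; it only illustrates why an HTML block page served in place of a JAR is easy to recognise.

```python
# ZIP archives (and therefore JARs) normally begin with a local file header
# signature; an empty archive starts with the end-of-central-directory record.
ZIP_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06")


def looks_like_jar(body: bytes) -> bool:
    """Return True if the payload starts with a known ZIP signature."""
    return body.startswith(ZIP_SIGNATURES)


# Hypothetical payloads: a web filter's HTML block page and a stub ZIP header.
block_page = b"<html><head><title>Blocked</title></head></html>"
zip_stub = b"PK\x03\x04" + b"\x00" * 26

print(looks_like_jar(block_page))  # False
print(looks_like_jar(zip_stub))    # True
```

A signature check like this is deliberately shallow: it rejects obvious substitutions such as HTML, but says nothing about whether the archive is complete or uncorrupted.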

          Activity

          James Nord added a comment -

          It is hard to get back to this exact scenario, but this is what the HTTP request looks like from a browser.

          > GET http://www.playboy.com/sometest/foo.jar HTTP/1.1
          > Host: www.playboy.com
          > Proxy-Connection: keep-alive
          > Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
          > User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.84 Safari/534.13
          > Accept-Encoding: gzip,deflate,sdch
          > Accept-Language: en-GB,en;q=0.8
          > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
          
          < HTTP/1.1 302 Moved Temporarily
          < Date: Wed, 09 Feb 2011 13:17:27 GMT
          < Proxy-Connection: close
          < Via: 1.1 localhost.localdomain
          < Location: http://xxx.xxx.xxx.xxx:yyyy/cgi-bin/blockpage.cgi?ws-session=123456789
          < Content-Length: 0
          
          
          > GET /cgi-bin/blockpage.cgi?ws-session=123456789 HTTP/1.1
          > Host: xxx.xxx.xxx.xxx:yyyy
          > Connection: keep-alive
          > Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
          > User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.84 Safari/534.13
          > Accept-Encoding: gzip,deflate,sdch
          > Accept-Language: en-GB,en;q=0.8
          > Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
          
          < HTTP/1.0 200 OK
          < Content-Length: 1650
          < Content-Type: text/html; charset=iso-8859-1
          < 
          ...
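
The exchange above boils down to this: a request for a `.jar` ends, after one redirect, in a `200 OK` whose `Content-Type` is `text/html`. A rough sketch of flagging that mismatch follows. The function name and the extension-to-media-type table are assumptions made for illustration, not something Nexus or Websense is known to do.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical table of media types one might accept per file extension.
EXPECTED_TYPES = {
    ".jar": {"application/java-archive", "application/zip", "application/octet-stream"},
    ".pom": {"application/xml", "text/xml"},
}


def content_type_mismatch(requested_url: str, content_type_header: str) -> bool:
    """True when the final response type is implausible for the requested file."""
    ext = posixpath.splitext(urlsplit(requested_url).path)[1].lower()
    expected = EXPECTED_TYPES.get(ext)
    if expected is None:
        return False  # unknown extension: nothing to compare against
    # Drop parameters such as "; charset=iso-8859-1" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type not in expected


# Values taken from the trace: foo.jar was requested, text/html came back.
print(content_type_mismatch(
    "http://www.playboy.com/sometest/foo.jar",
    "text/html; charset=iso-8859-1",
))  # True
```

Because the block page arrives with a normal `200 OK`, the status code alone gives a proxy no hint that anything went wrong, which is why some inspection of the type or the body is needed.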
          
          Brian Demers added a comment -

          If I am reading the comments correctly, it seems that we might just be missing a feed entry when a file is blocked due to content validation?

          I think what everyone is confused about is that the file /home/software/nexus/nexus-professional-webapp-1.8.0/./../sonatype-work/nexus/storage/central/org/eclipse/text/3.3.0-v20070606-0010/text-3.3.0-v20070606-0010.jar is already in your system, and either passed content validation or was downloaded before you enabled it? You should be able to check the dates in the Artifact Info panel to verify this.

          Tamás Cservenák added a comment -

          I think now it's clear what happens:

          • Nexus' content validation did its job, but it was quiet about doing it. You would like to have at least an entry in the RSS feed. This was missed and a fix is on the way.
          • The build probably did pull the artifact's POM successfully (websense did not trigger on it), but failed with the JAR. Internally, the Indexer will try to look up the POM's corresponding JAR, but it does so in a clumsy way (NEXUS-3638).

          And the (indexer) logs pasted by James are not there because an invalid JAR (an HTML page) is in the cache; they are there because the JAR file is not present at all, and the indexer is just "blindly" trying to open the file.
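
The two cases that were being confused here (a JAR that is missing versus a file that is present but not a valid archive) can be told apart explicitly. The sketch below is only an illustration of that distinction using Python's standard `zipfile` module; `classify_jar` and the file names are hypothetical, and this is not how the Maven Indexer is implemented.

```python
import tempfile
import zipfile
from pathlib import Path


def classify_jar(path: Path) -> str:
    """Distinguish a missing file from one that exists but is not a ZIP."""
    if not path.is_file():
        return "missing"
    if not zipfile.is_zipfile(path):
        return "invalid"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    html_jar = root / "blocked.jar"  # HTML saved under a .jar name
    html_jar.write_text("<html>blocked</html>")

    real_jar = root / "real.jar"  # a small but genuine archive
    with zipfile.ZipFile(real_jar, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")

    results = {
        "absent.jar": classify_jar(root / "absent.jar"),
        "blocked.jar": classify_jar(html_jar),
        "real.jar": classify_jar(real_jar),
    }

print(results)
```

Logging the "missing" and "invalid" outcomes with different messages would avoid the misreading described in this thread, where a failed open was taken as evidence that a bad JAR had been cached.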

          James Nord added a comment -

          the Nexus' content validation did its job, but it was quiet about doing it. You would like to have at least an entry in the RSS feed. This was missed and a fix is on the way.

          Yes.

          The build probably did pull the artifact's POM successfully (websense did not trigger on it), but failed with the JAR.

          I believe this was the case at the time.

          If I am reading the comments correctly, it seems that we might just be missing a feed entry when a file is blocked due to content validation?

          A feed entry and a log file entry (it appears I had mistaken the indexer log entry for a content validation log entry).

          I think what everyone is confused about is that the file /home/software/nexus/nexus-professional-webapp-1.8.0/./../sonatype-work/nexus/storage/central/org/eclipse/text/3.3.0-v20070606-0010/text-3.3.0-v20070606-0010.jar is already in your system, and either passed content validation or was downloaded before you enabled it? You should be able to check the dates in the Artifact Info panel to verify this.

          When I checked just now, the jar was not present (we used to expire proxied items, so I'm assuming it was cleaned up but the pom was not).
          After re-requesting the jar I looked at the "Artifact Information":
          For the jar:
          Uploaded Date:
          Tue Nov 27 2007 07:14:42 GMT+0000 (GMT Standard Time)
          Last Modified:
          Tue Nov 27 2007 07:14:42 GMT+0000 (GMT Standard Time)

          For the pom:
          Uploaded Date:
          Tue Nov 27 2007 07:14:41 GMT+0000 (GMT Standard Time)
          Last Modified:
          Tue Nov 27 2007 07:14:41 GMT+0000 (GMT Standard Time)

          I'm not sure what that tells you or me, apart from when it was uploaded to the remote (Maven Central) rather than when it was cached.

          Tamás Cservenák added a comment -

          Validated: the RSS feed does contain an entry about the artifact not passing content validation.

          On a side note: Maven Indexer also got some enhancements and will no longer produce misleading logs like those in this issue (before, it "blindly" tried to open a JAR that was not present, misleading us into believing that the JAR had been downloaded when it was actually not in the local cache).

          Tested using the teaser servlet:

          https://github.com/cstamas/teaser

          Just create a proxy against the /echo resource (it accepts and processes any path below /echo by just dumping the response as text/plain). This triggered content validation (I requested a POM) and the request was banned; no POM (or plaintext response) was cached, and an RSS entry was created: "...the artifact /log4j/log4j/1.2.13/log4j-1.2.13.pom content is invalid in repository echo-proxy!".
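
For anyone monitoring the feeds, an entry like the one quoted above can be picked out mechanically. The sketch below assumes a plain RSS 2.0 document with the message in each item's `<title>`; the sample feed, the function name, and the regular expression are hypothetical, and only the quoted message text comes from this issue. The real layout of the Nexus feeds may differ.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed document; only the quoted message text comes from the issue.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Error and Warning events</title>
    <item>
      <title>...the artifact /log4j/log4j/1.2.13/log4j-1.2.13.pom content is invalid in repository echo-proxy!</title>
    </item>
    <item>
      <title>Some unrelated event</title>
    </item>
  </channel>
</rss>
"""

# Matches the message wording reported in the comment above.
INVALID_CONTENT = re.compile(
    r"the artifact (?P<path>\S+) content is invalid in repository (?P<repo>[^\s!]+)"
)


def find_validation_failures(feed_xml: str) -> list[tuple[str, str]]:
    """Return (repository, artifact path) pairs for content validation entries."""
    failures = []
    for item in ET.fromstring(feed_xml).iter("item"):
        text = item.findtext("title") or ""
        match = INVALID_CONTENT.search(text)
        if match:
            failures.append((match["repo"], match["path"]))
    return failures


print(find_validation_failures(SAMPLE_FEED))
```

Matching on message wording is brittle, so a check like this would need revisiting whenever the feed text changes between Nexus versions.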


            People

            • Assignee: Tamás Cservenák
            • Reporter: James Nord
            • Last Updated By: Jason Dillon
            • Votes: 0
            • Watchers: 1
