Issue Details (XML | Word | Printable)

Key: HUDSON-3273
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: ashlux
Reporter: peterandersen
Votes: 19
Watchers: 14
Operations

If you were logged in you would be able to see more operations.
Hudson

Maven build aborted before execution due to SocketTimeoutException

Created: 11/Mar/09 08:46 AM   Updated: 20/May/10 02:56 AM   Resolved: 04/May/10 09:12 PM
Return to search
Component/s: maven2
Affects Version/s: current
Fix Version/s: None

Environment: Platform: All, OS: Linux
Issue Links:
Dependency
 


 Description  « Hide

Hi

I have a build server that sometime abort the scheduled build with just error
aborted - else no error messages, see example build log below.

I have configured a logger for the fine level (logger: hudson, log level: fine)
see catelina log below - no useful errors

I have just start calling maven with -X for debug output - if maven was the
course - but have not yet capture the error (if maven is to blame - don't think so)

Any suggestions - please see logs are version data below

Best regards

Peter Andersen

------------

Versions:
Hudson: 1.289
Tomcat: 6.0.18
java version "1.6.0_0"
OpenJDK Runtime Environment (build 1.6.0_0-b11)
OpenJDK 64-Bit Server VM (build 1.6.0_0-b11, mixed mode)
Maven: 2.0.9
OS Debian 5, uname -a:
Linux hudson 2.6.26-1-amd64 #1 SMP Sat Jan 10 17:57:00 UTC 2009 x86_64 GNU/Linux

Example aborted build:
started
Updating svn://svn.lenio.dk/xxxx/trunk
U ...
A build.xml
At revision 11963
Parsing POMs
[trunk] $ /usr/lib/jvm/java-6-openjdk/bin/java -Xmx768m -XX:MaxPermSize=256m -cp
/home/sysadm/apache-tomcat-6.0.18/webapps/hudson/WEB-INF/lib/maven-agent-1.289.jar:/usr/share/maven2/boot/classworlds.jar
hudson.maven.agent.Main /usr/share/maven2
/home/sysadm/apache-tomcat-6.0.18/webapps/hudson/WEB-INF/lib/remoting-1.289.jar
/home/sysadm/apache-tomcat-6.0.18/webapps/hudson/WEB-INF/lib/maven-interceptor-1.289.jar
38194
ERROR: Aborted
No emails were triggered.
finished: ABORTED

Catalina log snippet
Mar 11, 2009 1:33:35 AM hudson.model.Queue maintain
FINE: Queue maintenance started hudson.model.Queue@338f7701
Mar 11, 2009 1:33:40 AM hudson.model.Queue maintain
FINE: Queue maintenance started hudson.model.Queue@338f7701
Mar 11, 2009 1:33:45 AM hudson.model.Queue maintain
FINE: Queue maintenance started hudson.model.Queue@338f7701
Mar 11, 2009 1:33:48 AM hudson.model.Run setResult
FINE: diop-site-build #12 : result is set to ABORTED by
hudson.model.Run.run(Run.java:896)
Mar 11, 2009 1:33:48 AM hudson.model.Run run
INFO: diop-site-build #12 main build action completed: ABORTED



Sort Order: Ascending order - Click to sort in descending order
sdempsay added a comment - 20/May/09 04:21 PM

I am seeing this when trying to set up a new slave on 1-306. The host I had
been using to slave is still working, but when it tries to run the new slave I
get:

Started by user rsaska
Building remotely on dev-blade-3
Updating http://s***/Connect/trunk/java-infrastructure
At revision 234
no change for http://***/Connect/trunk/java-infrastructure since the previous
build
Parsing POMs
[.] $ java Xmx1024M -cp /home/hudson/maven-agent.jar:/usr/share/maven
2.0.10/boot/classworlds-1.1.jar hudson.maven.agent.Main /usr/share/maven-2.0.10
/home/hudson/slave.jar /home/hudson/maven-interceptor.jar 56107
ERROR: Aborted
Skipping Cobertura coverage report as build was not successful...
Finished: ABORTED


gj added a comment - 03/Sep/09 09:21 AM

I also seem to be encountering this issue with H1.318:
A SCM change trigger started this job
Updating http://.../trunk
U pom.xml
At revision 27567
Parsing POMs
[someproject] $ java cp /usr/local/hudson/data/plugins/maven-plugin/WEB-INF/lib/maven-agent
1.318.jar:/usr/local/hudson/maven/apache-maven-2.0.10/boot/classworlds-1.1.jar
hudson.maven.agent.Main /usr/local/hudson/maven/apache-maven-2.0.10
/Library/Tomcat/work/Catalina/hudson.---.com/_/WEB-INF/lib/remoting-1.318.jar
/usr/local/hudson/data/plugins/maven-plugin/WEB-INF/lib/maven-interceptor-1.318.jar 50328
ERROR: Aborted
Finished: ABORTED


gj added a comment - 14/Sep/09 07:48 AM

Still getting the same symptoms and behaviour with 1.323


blackmuzzle added a comment - 16/Sep/09 01:14 AM

Same issue here: Large maven2 project, builds fine on command line, builds fine
in Hudson - but occasionally the build aborts right after invoking the maven plugin:

...
At revision 4765
no change for http://redacted/trunk since the previous build
Parsing POMs
[southcom] $ /root/.hudson/tools//bin/java -cp
/root/.hudson/plugins/maven-plugin/WEB-INF/lib/maven-agent-1.322.jar:/opt/apache-maven-2.1.0/boot/classworlds-1.1.jar
hudson.maven.agent.Main /opt/apache-maven-2.1.0
/opt/apache-tomcat-6.0.18/webapps/hudson/WEB-INF/lib/remoting-1.322.jar
/root/.hudson/plugins/maven-plugin/WEB-INF/lib/maven-interceptor-1.322.jar 34633
/root/.hudson/plugins/maven-plugin/WEB-INF/lib/maven2.1-interceptor-1.2.jar
ERROR: Aborted

Simply retrying works, eventually. Some days it's hard to get hudson to build,
other days we don't have a single "aborted" build.

Oh and yes, access control is enabled and nobody pressed "abort".


abayer added a comment - 16/Sep/09 01:46 PM

The problem here seems to be that the channel never really gets started properly

  • Hudson never even has the chance to run the Maven process. I'm not sure what's
    happening to the channel, though.

abayer added a comment - 28/Oct/09 11:50 AM

Taking issue - working on reproducing and improving logging.


scm_issue_link added a comment - 28/Oct/09 01:29 PM

Code changed in hudson
User: : abayer
Path:
trunk/hudson/main/maven-plugin/src/main/java/hudson/maven/MavenModuleSetBuild.java
http://fisheye4.cenqua.com/changelog/hudson/?cs=23288
Log:
Improving logging for debugging HUDSON-3273


abayer added a comment - 30/Oct/09 01:23 PM

We've been able to reproduce this at least once, with improved debugging:

ERROR: Aborted Maven execution for InterruptedIOException
java.net.SocketTimeoutException: Accept timed out
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
at java.net.ServerSocket.implAccept(ServerSocket.java:450)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at
hudson.maven.MavenProcessFactory$SocketHandler$AcceptorImpl.accept(MavenProcessFactory.java:167)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:274)
at
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:255)
at
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:215)
at hudson.remoting.UserRequest.perform(UserRequest.java:104)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:244)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
at java.lang.Thread.run(Thread.java:595)
Finished: ABORTED


abayer added a comment - 30/Oct/09 01:33 PM

Clarifying summary


alacenaire added a comment - 05/Jan/10 05:52 AM

Hi,

I have the same error with Hudson 1.339.

It happens after a SVN checkout of a Maven project, when parsing the pom :

At revision 6651
no change for http://BLABLA/svn/BLABLA/trunk since the previous build
Parsing POMs
[BLABLA] $ c:\jdk1.6.0_17/bin/java -Xms512m -Xmx1024m -cp E:\hudson\.hudson\plugins\maven-plugin\WEB-INF\lib\maven-agent-1.339.jar;c:\apache-maven-2.2.1\boot\classworlds-1.1.jar hudson.maven.agent.Main c:\apache-maven-2.2.1\ C:\Tomcat_CI\temp\0-hudson\WEB-INF\lib\remoting-1.339.jar E:\hudson\.hudson\plugins\maven-plugin\WEB-INF\lib\maven-interceptor-1.339.jar 2417 E:\hudson\.hudson\plugins\maven-plugin\WEB-INF\lib\maven2.1-interceptor-1.2.jar
ERROR: Aborted Maven execution for InterruptedIOException
java.net.SocketTimeoutException: Accept timed out
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(Unknown Source)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at hudson.maven.MavenProcessFactory$SocketHandler$AcceptorImpl.accept(MavenProcessFactory.java:167)
at hudson.maven.MavenProcessFactory.newProcess(MavenProcessFactory.java:202)
at hudson.maven.ProcessCache.get(ProcessCache.java:231)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.doRun(MavenModuleSetBuild.java:455)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:417)
at hudson.model.Run.run(Run.java:1179)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:304)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Finished: ABORTED


mbeveridge added a comment - 20/Jan/10 01:18 AM

I get the same thing also on Hudson 1.341, again after SVN checkout:

At revision 12051
no change for http://subversion/repos/project/trunk since the previous build
Parsing POMs
[project] $ "C:\Documents and Settings\Default User\.hudson\tools\/bin/java" -Xms256m -Xmx512m -cp "C:\Documents and Settings\Default User\.hudson\plugins\maven-plugin\WEB-INF\lib\maven-agent-1.341.jar;C:\Documents and Settings\Default User\.hudson\tools\maven2\boot\classworlds-1.1.jar" hudson.maven.agent.Main "C:\Documents and Settings\Default User\.hudson\tools\maven2" C:\tomcat\webapps\hudson\WEB-INF\lib\remoting-1.341.jar "C:\Documents and Settings\Default User\.hudson\plugins\maven-plugin\WEB-INF\lib\maven-interceptor-1.341.jar" 2560 "C:\Documents and Settings\Default User\.hudson\plugins\maven-plugin\WEB-INF\lib\maven2.1-interceptor-1.2.jar"
ERROR: Aborted Maven execution for InterruptedIOException
java.net.SocketTimeoutException: Accept timed out
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
at java.net.ServerSocket.implAccept(ServerSocket.java:453)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at hudson.maven.MavenProcessFactory$SocketHandler$AcceptorImpl.accept(MavenProcessFactory.java:167)
at hudson.maven.MavenProcessFactory.newProcess(MavenProcessFactory.java:202)
at hudson.maven.ProcessCache.get(ProcessCache.java:231)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.doRun(MavenModuleSetBuild.java:455)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1193)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:304)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:122)
Skipping Cobertura coverage report as build was not UNSTABLE or better ...
Finished: ABORTED

After re-running the build a couple of times it works, but it means that we have to do manual builds rather than using SCM polling or scheduled builds...


abayer added a comment - 13/Feb/10 06:15 AM

Question for people seeing this - what ports are you seeing this error show up with vs those times you can build successfully? The port is the numeric second-to-last argument in the Java call right before the error. The ServerSocket is getting opened on that port, and we'd be seeing a different exception, thrown before we even get to that java call, if that binding failed. So I'm wondering if there may be something blocking the master from getting to that port on the slave.


mbeveridge added a comment - 16/Feb/10 04:30 AM

There doesn't seem to be much connection between port number and the error as far as I can see. The last few builds I have done each used different port numbers but whether it failed or not tended to be associated with whether it was some time (e.g. a few hours) since hudson last did a build, as below:

1) ran build at start of week for project A: used port 4377 and aborted
2) re-ran build for project A immediately: used port 4584 and succeeded
3) ran build after that for project B: used port 4629 and aborted
4) re-ran build for project B immediately: used port 4722 and succeeded
5) subsequently ran builds at regular short intervals for project B:

  • port 2926, success
  • port 3439, success
  • port 4026, success
    6) after a couple of hours, ran a new build for project B: used port 2698 and aborted
    7) after another hour, ran a new build for porject B: used port 1611 and aborted
    8) re-ran build for project B immediately: used port 1674 and succeeded

Generally I find that once it has started working it will work fairly regularly but if no build happens for a few hours then the next build will abort and re-running it makes it work.

By the way, I am running on a master-only configuration, no slave machines configured.


skybird added a comment - 11/Mar/10 01:32 AM

Hi,

in 1.327 it was already present, an Upgrade to 1.348 only added the "ERROR: Aborted Maven execution for InterruptedIOException" at the bottom of the job.
I am running a master-only config using W2k inside Tomcat 5.5.
I tried to reduce the number of build Executors from 2 to 1 but this didn't change anything.
I added a daily service restart of Tomcat but no change.

The maven call I see is the following:
[workspace] $ c:\EProg\jdk1.5.0_22/bin/java -Xmx256M -cp D:\plugins\maven-plugin\WEB-INF\lib\maven-agent-1.348.jar;D:\jobs\autobuild_init_maven\workspace\maven-2.2.1\boot\classworlds-1.1.jar hudson.maven.agent.Main D:\jobs\autobuild_init_maven\workspace\maven-2.2.1\ C:\EProg\Tomcat\webapps\hudson\WEB-INF\lib\remoting-1.348.jar D:\plugins\maven-plugin\WEB-INF\lib\maven-interceptor-1.348.jar 1990 D:\plugins\maven-plugin\WEB-INF\lib\maven2.1-interceptor-1.2.jar

As mbeveridge mentioned, once it startet sucessfull there are no more problems.

My Ports:
Success: 1340 2249 4411 1977 3190 4749 1538
Fail:1990 3585 3988

It happens even on big and small maven builds.


Ricket added a comment - 14/Mar/10 12:27 PM

I'm seeing the exact same thing. I have a really tiny Maven build; I'm just trying to get everything working correctly before I import my code, so I have a simple "Hello, world" program. This is very discouraging, to have spent a lot of time setting up Hudson only to find that half the time it can't run Maven correctly...


ashlux added a comment - 15/Mar/10 08:17 PM

Ricket: Are you really seeing the issue /very/ frequently? I only see it very rarely. Could you post more information about your environment (version of Hudson, Maven, Java, memory, cpu, OS, etc)? If you're really getting it more frequently than what I've seen, maybe this will help. Of course I'm just taking a stab in the dark.


Ricket added a comment - 15/Mar/10 09:06 PM

The issue is really just Hudson not waiting long enough before connecting, when the CPU is bogged down. I noticed my server was getting really slow so I shut down another service I was running (Artifactory) which was eating a lot of RAM and CPU, and now Hudson hasn't had a problem; the CPU and RAM have been freed so that the job is fast enough to start the Maven interceptor in time for Hudson to connect to it.

By the way, I have an old slow dedicated server (AMD 1.5ghz, 256MB DDR RAM) and am running a number of things on it (Apache, MySQL, Subversion via apache, Hudson, Sonar, Artifactory not anymore). So that's why the server was too slow before. I'm just glad Hudson is pretty lightweight, at least compared to Sonar.


ashlux added a comment - 15/Mar/10 09:38 PM

Thanks for the response Ricket.

I have been suspecting the timeout isn't set high enough; however timeout is set to 10 seconds. Surely that would be enough time? On the other hand I have no idea how much work actually takes place in that time, so 10 seconds might not be long enough.

IMO there's no harm in increasing the timeout to cut down on the number of false build failures. Perhaps a 30 second timeout would do the trick? The only reason that wouldn't work is with severely overloaded machines or if there's another issue with the Maven plugin.


Ricket added a comment - 15/Mar/10 09:44 PM

Yes, it would be good for the timeout to be higher. Apparently 10 seconds isn't enough for a bogged-down server, and I suspect I would see this problem again if I were to try and run two builds at once. And yes, like you said, I see no reason a higher timeout would cause any problems, except it would mean waiting an extra 20 seconds if someone were trying to debug a problem with Maven or something.


mbeveridge added a comment - 16/Mar/10 02:24 AM

Probably that also explains why it tends to fail on the first build after a period of inactivity but once it gets going they seem to work ok. The first build presumably causes lots of things to be loaded and cached etc so can timeout, but once they are cached subsequent builds are ok.

Is there a property that can be set to change the timeout for hudson to connect to the maven interceptor, or is that hardcoded?


ashlux added a comment - 16/Mar/10 06:03 AM

mbeveridge: the timeout is hardcoded at 10 seconds.


Ricket added a comment - 16/Mar/10 06:53 AM

mbeveridge: My thoughts exactly. With only 256mb, my server was probably well out of RAM and into its 4gb page file (well, I know it to be so; I ran free and top, and even now, after stopping Artifactory, it is about 100mb into the page file with full RAM usage), and so when the Maven interceptor was started, those things were brought from the page file into RAM, and resided there long enough for the successive builds to be successful. That's my theory anyway.


gonzalad added a comment - 23/Mar/10 12:29 AM

Hello,
Thanks for the tip.
I've set the socket timeout to 30s and recompiled MavenProcessFactory class in my environment (http://fisheye.hudson-ci.org/browse/Hudson/trunk/hudson/main/maven-plugin/src/main/java/hudson/maven/MavenProcessFactory.java?r=26722#l163).
Things are better now.
SocketTimeoutException occured nevertheless once.
Before this change, SocketTimeoutException occured more or less 1 time every 3 launches.

Didn't test it enough though for the moment.


gonzalad added a comment - 29/Mar/10 01:09 AM

After one week of usage with 30s socket timeout), I confirm it works definitely better but it's not 100% fool proof : more or less 9 launches out of 10 are ok.
I get from time to time SocketTimeoutException.

From my point of view it would be great to be able to change the socket timeout from the hudson admin console.


Ricket added a comment - 29/Mar/10 04:41 AM

gonzalad: What system are you running Hudson on? How much RAM does it have? Is it possible that you could stop some other service and see if it works more reliably?

As you probably read above, I was getting timeouts almost every time, and now after stopping the Artifactory service, I haven't tested it a whole lot but it worked every time I tested. Artifactory was using hundreds of MBs of RAM and swap space and some CPU time, but now that I cut it out, Maven is able to start fast enough to connect to Hudson.


persyval added a comment - 29/Mar/10 05:31 AM

We have the same issue regarding the timeout.
Our buildserver which runs Hudson resides on a virtual machine on a server that has multiple virtual instances.
The first three or four build runs on a day have this timeout error.

After that it seems the virtual machine has allocated enough memory or cpu for itself that the problem goes away.
so it definitely seems to be associated with poor machine resources.
Would be nice to have an optional parameter to enter the timeout value though.


gonzalad added a comment - 30/Mar/10 09:51 AM

Sorry Ricket for the late answer,
My server is running on Windows XP SP3 with MacAfee (Mcshield.exe).
2.33 GHz, 2Go RAM Core 2Duo E6550.
1.17 Go Used (without Hudson, I needed to stop Hudson today for another pb ;( ).
Running Hudson, Trac (with apache), Sonar (this one doing nothing) and MySql (for Sonar).
I cannot use Hudson for one week on this workstation. I'll monitor better next week. Sorry.


scm_issue_link added a comment - 04/May/10 09:12 PM

Code changed in hudson
User: : ashlux
Path:
trunk/hudson/main/maven-plugin/src/main/java/hudson/maven/MavenProcessFactory.java
http://hudson-ci.org/commit/30733
Log:
[FIXED HUDSON-3273] Increase timeout to 30 seconds since 10 seconds isn't enough for machines with poor resources.


skybird added a comment - 20/May/10 02:56 AM

Thank you very much for the fix, I can confirm since I upgraded to 1.357 the error is gone.