A Sametime Chat Mystery

Today I was contacted urgently by a site I did an install for back in early September. The install went well and I left them several months ago with working components, but apparently about a week ago people stopped being able to log in to the Community server. In fact, not even the SSC could access it.

...and yet no-one had changed anything at all. I do love a good mystery, so I thought it would be useful to someone (or even just future Gab) to document what I did:

  • verified if port 1533 was listening using netstat -an | find /i "1533".
  • verified there were no running AV services that could interfere with the ports.
  • checked if the ST services were running; in fact only about 6 were.
  • tried to start some of the services that weren’t running and they failed immediately.
  • since no-one touched Sametime my next guess was a Windows update that caused a problem.
  • checked whether the Windows networking settings had been overwritten (they had). Although those settings shouldn’t cause the services to fail completely, it was worth resetting them.
  • I then added vp_trace_all=1 to the [Debug] settings in the sametime.ini which creates detailed log files in the \ibm\domino\trace directory.
  • having added that, I could see log files being created for every service, even the ones that wouldn’t stay started. In fact, those ones were recreated every couple of minutes. So the services were trying to start and failing.
  • reviewing the log files I could see on things like STPlaces there was a JVM error, but I put that aside for the time being in case it was a dependency issue.
  • in other logs such as STDirectory I could see broken networking errors and just before that I could see a comment about switching to TLS.

    A-ha! Well, that’s new.

  • checking the sametime.ini I found:
    VPS_PORT=1516
    VPS_TLS_PORT=1516

    which I changed to:
    VPS_PORT=1516
    #VPS_TLS_PORT=1516

    My guess is that an incomplete TLS configuration from the SSC left both settings pointing at the same port. Having made that change, the server restarted perfectly and all services started. The SSC could then access the server with no problem.
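
For future Gab, here is a rough Python sketch of two of the checks above: whether anything is accepting connections on a port, and whether sametime.ini has the plain and TLS ports set to the same value. The function names are my own invention for illustration, and I am assuming that lines starting with # or ; are comments and that [Section] headers can be skipped. Treat it as a starting point, not a supported tool.

```python
import socket


def read_ini_settings(text):
    """Return {KEY: value} for active KEY=value lines.

    Assumption: lines starting with '#' or ';' are comments, and
    '[Section]' headers can be ignored for this check.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip().upper()] = value.strip()
    return settings


def find_port_clash(text, plain_key="VPS_PORT", tls_key="VPS_TLS_PORT"):
    """Return the shared port if both settings are active and equal, else None."""
    settings = read_ini_settings(text)
    plain = settings.get(plain_key)
    tls = settings.get(tls_key)
    if plain is not None and plain == tls:
        return plain
    return None


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

To use it, read the contents of sametime.ini into a string and pass it to find_port_clash. With the original settings it returns the shared port as a string; once the VPS_TLS_PORT line is commented out it returns None. port_is_listening("localhost", 1533) is the rough equivalent of the netstat check, run from the server itself.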

Of course, once I had spent 4 hours doing that, I then found a technote on it, which I never would have found before I saw the TLS entry. Here’s the technote.

Sometimes it’s a rollercoaster, but so long as I get things working I’m calling that a good day. Now back to building more Connections servers.