HCL Domino Ideas Portal

Status: Assessment
Workspace: Domino
Categories: Administration
Created by Guest
Created on Dec 8, 2020
Merged idea
This idea has been merged into another idea. To comment or vote on this idea, please visit DOMINO-I-95 Add possibility to run agent on cluster server(s).

Provide failover for scheduled agent execution on Domino Cluster (Merged)

In every current release of Domino server (9.0.x/10.0.x/11.0.x), when failover occurs in a Domino cluster, a scheduled agent cannot be switched automatically to run on the failover target server (secondary server); it can only run on the original primary server.

The customer would like HCL to provide the following new functions to support failover for scheduled agent execution in a Domino cluster:

1) Provide an additional setting to select the secondary server on which a scheduled agent can run after failover.

2) Also provide an API to select the secondary server on which a scheduled agent can run after failover.

  • Admin
    Thomas Hampel
    Jan 18, 2021

    Admins do not always know why an agent needs to run, or where it is supposed to run.

    Even today, developers who want high availability for their agents can implement it by defining for themselves how it should work.

    For example, the agent scheduled on server #2 can periodically check whether it can open the application on server #1. Alternatively, you can call a low-level OS method to 'ping' the other server and check whether it is still alive.

    However, all business logic and replication conflict prevention must be managed by the developer.
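
The "is the other server still alive" check described in this comment could be sketched roughly as follows. This is only an illustration in Python, not Domino agent code: `is_reachable` and `backup_should_run` are hypothetical helper names, and the sketch assumes the other server accepts TCP connections on port 1352, the port registered for Notes/Domino NRPC traffic.

```python
import socket

# 1352 is the registered TCP port for Lotus Notes/Domino NRPC traffic.
NRPC_PORT = 1352


def is_reachable(host: str, port: int = NRPC_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def backup_should_run(primary_host: str, probe=is_reachable) -> bool:
    """Hypothetical rule: the backup agent works only if the primary looks down."""
    return not probe(primary_host)
```

A successful connection only shows that something is listening on that port; it does not prove that the application can be opened. A failed connection cannot distinguish a crashed server from a broken network path, which is the limitation other comments in this thread raise.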

  • Guest
    Dec 24, 2020

    The idea of a second server is bad: both servers may crash. Which server in the cluster should the agents run on then?

    But there is a way to avoid conflict.

    1. The administrator manually changes the processing server in the configuration document to a working server.

    2. HCL changes how the server starts after a crash: it starts without Amgr and RunJava, and starts those tasks automatically only after the configuration database has been replicated.

  • Guest
    Dec 13, 2020

    @ThomasHampel: This is absolutely right, but there is no way to find a solution for clustered agents that does NOT fail in a scenario where the network connection between the servers is lost. How should the failover server determine whether the other server was completely shut down via the power switch or whether only the network connection is down? All solutions, even if included in the server core, need to check whether the other server is able to run the agent. Best regards, Torsten

  • Admin
    Thomas Hampel
    Dec 13, 2020

    In this example, a disconnected network cable or a simple routing issue between the cluster members will lead to agents running on both servers at the same time: server2 will assume that server1 is down, while server1 is still running but has no network connection to its cluster partner(s).

  • Guest
    Dec 8, 2020

    I agree that this would be very useful. I work with a configuration document and a special check for this purpose. The agent is scheduled to run on "all servers". The configuration document names the server on which the agent is currently meant to run. As soon as the agent starts, it checks whether its own server is the configured server. If it is not, it tries to reach the configured server and open the database there. If the database opens, the agent stops. If it does NOT open, the target server must be down or the database corrupt, so the agent writes its own server name into the configuration document and is the "configured server" from that moment on, until it goes down and the other agent takes over (or an admin manually changes the server back in the config document).
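
The takeover check described in this comment can be modeled in a few lines. The following is a minimal Python sketch of the decision logic only, not the commenter's actual agent: `ConfigDocument`, `should_run`, and the `can_open_database_on` probe are hypothetical names, and the configuration document is simulated as a single in-memory object rather than a replicated Domino document.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConfigDocument:
    """Stands in for the configuration document (hypothetical, in memory)."""
    configured_server: str


def should_run(
    this_server: str,
    config: ConfigDocument,
    can_open_database_on: Callable[[str], bool],
) -> bool:
    """Decide whether the agent started on this_server should do its work.

    can_open_database_on is a hypothetical probe that returns True if the
    database can be opened on the named server.
    """
    if config.configured_server == this_server:
        return True  # this server is the configured one
    if can_open_database_on(config.configured_server):
        return False  # configured server is healthy, so this agent stops
    # Configured server is down or its database cannot be opened: take over.
    config.configured_server = this_server
    return True


# Simulated cluster state: which servers currently answer the probe.
server_up = {"server1": True, "server2": True}
config = ConfigDocument(configured_server="server1")

first = should_run("server2", config, lambda name: server_up[name])
server_up["server1"] = False  # server1 goes down
second = should_run("server2", config, lambda name: server_up[name])
print(first, second, config.configured_server)  # False True server2
```

In this simulation both agents share one copy of the configuration document. In a real cluster each server holds its own replica, so if only the network link between the servers fails, the probe fails on server2 while server1 keeps running, and both agents end up running, as discussed elsewhere in this thread.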