This is not about agents that are not stopped nicely. The servers are hard killed.
That's a different concern than an agent that is stopped. Stopping a server would also need enhancements.
Currently, "quit" hits all server tasks and the core services at once.
Code that is doing a search operation at that moment will get an error message.
Here is what would make sense:
There should be a new internal state which tasks could check in the future to do a cooperative shutdown.
Each task should monitor the new shutdown trigger and stop after its current operation to shut down cleanly.
The shutdown would be a bit like a drain on K8s: a cooperative shutdown in two phases.
1. Signal all services that the server is stopping soon, to give each task a chance to stop cleanly.
2. After some configurable time, send the quit signal like today.
This would stop servers more cleanly, but it would still not be guaranteed. Some long-running operations would still have to be terminated, or else the server would never stop.
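To make the two phases above concrete, here is a minimal sketch in Python. It is not Domino code and not how Domino works internally. The names Task, two_phase_shutdown, draining, quit_now and grace_seconds are all made up for illustration. It only models the idea: tasks check a drain flag between operations, and after a configurable grace period the hard quit is sent anyway.

```python
import threading
import time


class Task(threading.Thread):
    """Hypothetical server task that works in units and checks the drain flag between them."""

    def __init__(self, name, units, unit_seconds, draining, quit_now):
        super().__init__(name=name, daemon=True)
        self.units = units
        self.unit_seconds = unit_seconds
        self.draining = draining      # phase 1: "server is stopping soon"
        self.quit_now = quit_now      # phase 2: the quit signal, like today
        self.completed = 0
        self.stopped_cleanly = False

    def run(self):
        for _ in range(self.units):
            if self.draining.is_set():
                # Cooperative stop between two operations.
                self.stopped_cleanly = True
                return
            # One operation (simulated); it is only abandoned on the hard quit.
            if self.quit_now.wait(self.unit_seconds):
                return
            self.completed += 1
        self.stopped_cleanly = True


def two_phase_shutdown(tasks, draining, quit_now, grace_seconds):
    """Signal the drain, wait up to grace_seconds in total, then send the quit."""
    draining.set()
    deadline = time.monotonic() + grace_seconds
    for task in tasks:
        task.join(max(0.0, deadline - time.monotonic()))
    quit_now.set()
    for task in tasks:
        task.join()
    return {task.name: task.stopped_cleanly for task in tasks}


if __name__ == "__main__":
    draining, quit_now = threading.Event(), threading.Event()
    demo = [
        Task("short-ops", 1000, 0.01, draining, quit_now),
        Task("long-op", 1, 30.0, draining, quit_now),
    ]
    for task in demo:
        task.start()
    time.sleep(0.05)
    print(two_phase_shutdown(demo, draining, quit_now, grace_seconds=0.2))
```

The task with short operations sees the drain flag and stops between operations. The task stuck in one long operation never gets to check the flag and is terminated by the quit after the grace period, which is the "still not guaranteed" case described above.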
The Windows PRE_SHUTDOWN is a different story. The Domino processes are hard killed.
You should always have Translog enabled to try to protect against database corruption in case of a crash.
But a Windows shutdown can usually wait a minute or two until the Domino server is shut down.
That's what this Aha idea is about. The good news is that, from what I have heard, the feature has been implemented in 14.5.1.
I just proved it again: with Domino running as a service, log on to the Windows Server OS and choose to 'restart' Windows. It restarts quickly. Reviewing what happened: Domino crashes (it does not finish writing to log.nsf), and there is other crash info in the Domino directories. I could not find much logging in the Windows Event Viewer showing where it tried to shut down the HCL Domino service. This has caused corruption issues in NSF files (recently in admin4.nsf). If there are agents running via Amgr, they will not be stopped nicely; they will just crash, possibly in the middle of a document update. When someone restarts the server OS, or if an organization has automatic Windows updates and restarts running, HCL must do whatever it can to help Windows shut down Domino nicely. Many organizations don't pay attention to their Domino servers. Too many organizations simply put them on an automatic Windows update schedule. They don't realize that this causes issues with Domino data. They don't treat the Domino server any differently than a file server.
This generic shutdown helper would detect when Windows shuts down and wait for the configured services to be safely ended before rebooting or shutting down: https://github.com/nashcom/nsh-tools/tree/main/nshshutdown
A hanging server which needs restart is a different situation.
This idea is about ensuring the Domino server is cleanly shut down when you reboot or shut down Windows without first stopping Domino.
Here is a generic shutdown helper which would catch the shutdown of Windows and delay the reboot/shutdown until the configured services have been safely stopped:
https://github.com/nashcom/nsh-tools/tree/main/nshshutdown
This is just a work-around. IMHO the Domino service should natively support "SERVICE_CONTROL_PRESHUTDOWN".
I have never looked into the Microsoft code details. However, if it is not guaranteed that the server will restart even when a Domino process is hanging for a longer time, I don't like this
(too many servers are only accessed remotely; a longer timeout might be a good thing, though).