Cluster Upgrade from 4.44 to 4.45¶
This guide leads you through the steps specific to upgrading a NetEye Cluster installation from version 4.44 to 4.45.
During the upgrade, individual nodes will be put into standby mode, so overall performance will be degraded until the upgrade is completed and all nodes are taken out of standby mode. Provided that the environment connectivity is stable, the upgrade procedure may take up to 30 minutes per node.
Warning
Remember that you must upgrade sequentially without skipping versions; therefore an upgrade to 4.45 is possible only from 4.44. For example, if you have version 4.27, you must first upgrade to 4.28, then to 4.29, and so on.
Breaking Changes¶
Ansible 2.16¶
Ansible has been upgraded to version 2.16. If you have any custom Ansible playbooks, please ensure they are compatible with this version. Refer to the official Ansible release notes for more information.
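As a quick first check of your custom playbooks against the new Ansible version, you can run a syntax check; this does not catch every incompatibility, and the playbook path below is only an example:
cluster# ansible --version
cluster# ansible-playbook --syntax-check /path/to/your/custom-playbook.yml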
IcingaDB Web¶
Starting from this version of NetEye, the monitoring module has been replaced by IcingaDB Web. This brings an improved user interface for various pages such as Overview, Hosts, Services, Problems and more. Under the hood, historical data is still saved to and loaded from the IDO as before; only in future releases will it be possible to migrate historical data to IcingaDB and consequently disable the IDO.
Due to these changes, the following breaking changes apply:
IcingaDB and monitoring roles¶
The restrictions and permissions for the monitoring objects are applied in a slightly different way in IcingaDB.
Example 1¶
Assume the following roles are configured:
Role A
Restrictions: Host group = “GroupA”
Permissions: General module access
Role B
Restrictions: Host group = “GroupB”
Permissions: General module access + Downtime management
Assume that the two roles do not inherit from each other and that both are assigned to user Bob.
Old behavior (monitoring module):
Bob can see hosts from both GroupA and GroupB in the Hosts list.
Bob can schedule downtimes for GroupA and GroupB hosts.
New behavior (IcingaDB Web):
Bob can see hosts from both GroupA and GroupB in the Hosts list.
Bob can schedule downtimes only for GroupB hosts.
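As an illustration only, the two roles from this example could look roughly like the following in /etc/icingaweb2/roles.ini; the role names, the direct user assignment and the exact restriction filters are assumptions made for this sketch, not values taken from your installation:
; minimal sketch of the IcingaDB Web role definitions for Example 1
[Role A]
users = "bob"
permissions = "module/icingadb"
icingadb/filter/objects = "hostgroup.name=GroupA"
[Role B]
users = "bob"
permissions = "module/icingadb, icingadb/command/downtime/schedule"
icingadb/filter/objects = "hostgroup.name=GroupB"
With definitions like these, IcingaDB Web applies the downtime permission only within the scope of Role B's own restriction, which is why Bob can schedule downtimes only for GroupB hosts.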
Example 2¶
Assume the following roles are configured:
Role A
Restrictions: Host group = “GroupA”
Permissions: General module access
Role B
Permissions: General module access + Downtime management
Inherits Role A
User Alice is assigned to Role B.
The behavior is unchanged between the monitoring module and IcingaDB Web:
Alice can see hosts from GroupA.
Alice can schedule downtimes for GroupA hosts.
Example 3¶
Assume the following roles are configured:
Role A
Restrictions: Host group = “GroupA”
Permissions: General module access
Role B
Restrictions: Host group = “GroupB”
Permissions: General module access
Role C
Permissions: General module access + Downtime management
User Charlie is assigned to Role A, Role B and Role C.
The behavior is unchanged between the monitoring module and IcingaDB Web:
Charlie can see hosts from both GroupA and GroupB in the Hosts list.
Charlie can schedule downtimes for GroupA and GroupB hosts.
For more information about this new behavior, please refer to the official blog post.
Monitoring URLs have changed¶
If you have any custom links or bookmarks that point to the old monitoring module pages, please update them to point to the new IcingaDB Web pages.
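For example, assuming the default /neteye base path and an illustrative hostname, a bookmark to the old host list such as https://neteye.example.com/neteye/monitoring/list/hosts would now point to https://neteye.example.com/neteye/icingadb/hosts; likewise, the old service list path monitoring/list/services corresponds to icingadb/services.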
Monitoring View Permission Changes¶
Due to the migration of monitoring to IcingaDB, some permissions related to the Monitoring View
module have been modified.
Since the sections in the host and service pages changed slightly, the permission monitoringview/problem-handling
is no longer applicable and has been removed.
To replace its functionality, the following new permissions have been added:
monitoringview/actions
monitoringview/comments
monitoringview/downtimes
monitoringview/groups
monitoringview/services
In order to maintain the previous behavior and avoid hiding sections that were previously visible,
these new permissions are automatically granted to roles that had the monitoringview/problem-handling permission.
Additionally, the permission monitoringview/check-execution has been renamed to
monitoringview/check-statistics without any change in functionality.
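For reference, a role that keeps the pre-upgrade behavior could list the permissions like this in /etc/icingaweb2/roles.ini; the role name is purely illustrative, and the upgrade performs the equivalent change automatically for roles that had monitoringview/problem-handling:
; illustrative sketch of the post-upgrade Monitoring View permissions on a role
[Monitoring Operators]
permissions = "monitoringview/actions, monitoringview/comments, monitoringview/downtimes, monitoringview/groups, monitoringview/services, monitoringview/check-statistics"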
Prerequisites¶
Before starting the upgrade, carefully read the latest release notes on NetEye’s blog and check the features that will change or be deprecated.
All NetEye packages installed on a currently running version must be updated according to the update procedure prior to running the upgrade.
NetEye must be up and running in a healthy state.
Disk Space required:
3GB for / and /var
150MB for /boot
If the NetEye Elastic Stack module is installed:
The rubygems.org domain must be reachable from the NetEye Master, but only during the update/upgrade procedure. This domain is needed to update additional Logstash plugins and is therefore required only if you manually installed any Logstash plugin that is not present by default.
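Before starting, you can verify these prerequisites from a shell on the node; the two commands below are only examples of quick checks, not part of the official procedure. The first shows the free space on the file systems listed above, the second simply confirms that rubygems.org answers over HTTPS:
cluster# df -h / /var /boot
cluster# curl -sI https://rubygems.org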
1. Run the Upgrade¶
The Cluster Upgrade is carried out by running the following command:
cluster# (nohup neteye upgrade &) && tail --retry -f nohup.out
Warning
If the NetEye Elastic Stack feature module is installed and a new version of Elasticsearch is available, please note that the procedure may take a while to upgrade the Elasticsearch cluster. For more information on the Elasticsearch cluster upgrade and how to customize the upgrade process, please consult the dedicated section.
After the command has been executed, the output will inform you whether the upgrade was successful or not:
In case of a successful upgrade you might need to restart the nodes to properly apply the upgrades. If the reboot is not needed, please skip the next step.
In case the command fails, refer to the troubleshooting section.
2. Reboot Nodes¶
Restart each node, one at a time, to apply the upgrades correctly.
Run the reboot command
cluster-node-N# neteye node reboot
In case of a standard NetEye node, put it back online once the reboot is finished
cluster-node-N# pcs node unstandby --wait=300
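Before moving on, you can optionally confirm that the node has rejoined the cluster and is no longer in standby, for example with:
cluster-node-N# pcs status nodes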
You can now reboot the next node.
3. Cluster Reactivation¶
At this point you can proceed to restore the cluster to high availability operation.
Bring all cluster nodes back out of standby with this command on the last standard node
cluster# pcs node unstandby --all --wait=300
cluster# echo $?
0
If the exit code is different from 0, some nodes have not been reactivated, so please make sure that all nodes are active before proceeding.
Run the checks in the section Checking that the Cluster Status is Normal. If any of the above checks fail, please contact our service and support team before proceeding.
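As a quick complement to that checklist, you can get an overview of the node and resource state with, for example, the following commands (the DRBD check applies only if your cluster uses DRBD resources):
cluster# pcs status
cluster# drbdadm status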
Re-enable fencing on the last standard node, if it was enabled prior to the upgrade:
cluster# pcs property set stonith-enabled=true
4. Additional Tasks¶
In this release, there are no additional tasks required after upgrading NetEye.