User Community for HCL Informix

High Availability Made Easy: ifxclone

1/18/2018


 
Updated: 8/30/18
Hopefully you have an established routine of local or cloud backups, and you are sleeping a little better (when you get a chance to sleep). As with fire drills and other emergency procedures, you are probably practicing restores of your backups on another server. You have probably discovered that, depending on the size of your data and the speed of the system, a restore can take hours. Moving from ontape to onbar helps by enabling parallel backup and restore of your dbspaces. But while backups can run with the lights on, the server is not accessible during a restore until the restore is complete. You may not be able to afford to have your applications down that long.

You don’t have to only react to a failure. You can proactively reduce downtime by using one or more of the Informix replication technologies discussed in the “Informix Replication Technology” blog by Nagaraju Inturi. With High-availability Data Replication (HDR), your database operations can switch to a secondary server in the event of a hardware or disk failure on your primary system. Each of the “for-purchase” licenses authorizes the use of at least one High-Availability server. If you are not using any of the “for-purchase” editions, or want to follow along without touching your actual system, HDR and RSS are available between two instances of the Developer Edition. This blog shows how to implement an HDR secondary server using the “ifxclone” command after making a few simple changes to your current server’s configuration, without stopping it (this is, of course, about availability).

Overview of HDR Components 

One server plays the role of “Primary”, and all updates made to tables in logging databases are recorded in its logical log. The Primary sends those log records to each of the “secondary” servers, and each secondary applies the changes to its own disks. This continually keeps the data on the secondaries consistent with the data on the primary.

If the connection between the Primary and the Secondary is too slow or unstable, it can impact performance on the primary. For these situations, creating a Remote Standalone Secondary (RSS) server allows the Primary to communicate with the secondary asynchronously. The trade-off is the risk that the primary fails before some committed transactions are recorded on the RSS machine. Therefore, HDR is your High-Availability option, and RSS is better for fast disaster recovery. Some “for-purchase” licenses entitle the use of HDR and RSS simultaneously.

Limitations/Restrictions 

HDR and RSS use the logical logs to keep the secondary system up to date. Therefore, for data to be replicated, it must be stored in logging databases and in spaces with logging. It cannot be in blobspaces (no logging is done for blobspace data), non-logging smart blob spaces, or external storage. Other restrictions on the configuration of the servers are listed in the Administrator’s Guide.

Prerequisites 

1. HARDWARE:  
      A second machine with: 
  • Sufficient storage capacity to match the storage allocated on the original machine
  • Adequate memory and processing capacity to handle your workload if the primary server fails or needs to be shut down
2. SOFTWARE: 
  • Install the same version of Informix software on the second machine as the original machine
  • Install all User-defined types, user-defined routines, and DataBlade modules that are installed on the original/primary machine
    (They do not need to be registered on the secondary server)
3. Customized Scripts:
  • alarmprogram.sh 
  • evidence.sh
  • (etc.) 
4. Ownership and Permission of storage paths:
  • Owner and group should match the primary server (informix:informix)
  • Directories containing cooked files must have 770 permissions

Common Setup (from Source Server) 

Some changes and files will be the same on both servers.  We will make the changes to the files on “host1” (our “server1” machine), and then “ifxclone” will copy them to the new target host (host2).  We are avoiding naming our systems “primary” and “secondary” because these are roles that can switch between the servers over time, and perhaps for long periods of time. 

Trusted Server Connection 
The primary server must trust connections from the new database server, and because the roles of the two servers can reverse, the trust must be mutual. While you can do this at the OS level, we will assume you want this trust limited to the database service. To do that, create a file in the $INFORMIXDIR/etc directory and name it in the ONCONFIG file’s REMOTE_SERVER_CFG parameter. For this example, we will create a file named "trusted.hosts" in the "$INFORMIXDIR/etc" directory. Using this parameter also allows HCL Informix tools to update the trust information later.

Create an empty “trusted.hosts” file, and set the file permissions to limit write access to the instance owner (informix) and group (informix):
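A minimal sketch of this step, assuming a POSIX shell and that you run it as informix or root. The function name create_trusted_hosts and the 660 mode are assumptions of mine: the text only requires that write access be limited to the owner and group, so relax the mode if your site needs others to read the file.

```shell
# Create the trusted-hosts file in the given etc directory (it starts out
# empty if it did not exist before) and print its path.
# Assumption: mode 660, i.e. read/write for owner and group, nothing for others.
create_trusted_hosts() (
    etc_dir="$1"
    file="$etc_dir/trusted.hosts"
    touch "$file" || exit 1
    chmod 660 "$file" || exit 1
    # Ownership can only be changed by root (or is already correct when you
    # are informix); skip the step if there is no informix account here.
    if id informix >/dev/null 2>&1; then
        chown informix:informix "$file" 2>/dev/null ||
            echo "could not chown $file; check it manually" >&2
    fi
    echo "$file"
)

# Typical call on host1:
#   create_trusted_hosts "$INFORMIXDIR/etc"
```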


    
Now configure the server to use the new file: 


    
Enable “ifxclone” and the “sysadmin:admin” function to set up the connectivity information on all servers in the cluster (for now it is just one).
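Both settings described so far can be applied to the running server with “onmode -wf”, which changes the value in memory and writes it to the ONCONFIG file. The sketch below only prints the commands so they can be reviewed first. The parameter name CDR_AUTO_DISCOVER is my reading of which setting enables the connectivity autoconfiguration; the text above does not name it, so confirm it in the Administrator's Reference for your version.

```shell
# Print the onmode commands for review; pipe the output to sh to apply them.
# $1 = trusted-hosts file name under $INFORMIXDIR/etc (default: trusted.hosts)
# Assumption: CDR_AUTO_DISCOVER is the parameter behind the connectivity
# autoconfiguration described above.
trust_onconfig_cmds() {
    printf 'onmode -wf %s\n' \
        "REMOTE_SERVER_CFG=${1:-trusted.hosts}" \
        "CDR_AUTO_DISCOVER=1"
}

trust_onconfig_cmds
```

Once the output looks right, `trust_onconfig_cmds | sh` runs the commands as the informix user on host1.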


    
You can manually edit the file the first time, but we will instead use dbaccess to execute the “admin” SQL function in the “sysadmin” database, which updates the REMOTE_SERVER_CFG file on all servers in the cluster. I am also adding our current server as trusted so that either server can initiate a connection to the other; this trust information will eventually be copied to our new server too:
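As a sketch of that call, the helper below builds the SQL statement for any number of hosts. The helper name trustedhost_sql is hypothetical, trusting the user “informix” on each host is an assumption, and the “cdr add trustedhost” argument is taken from the SQL administration API as I understand it; check the exact syntax for your version.

```shell
# Emit the SQL that registers each given host as trusted for user informix.
# Arguments are host names; "informix" as the trusted user is an assumption.
trustedhost_sql() (
    list=""
    for host in "$@"; do
        # Build a comma-separated "host user" list.
        list="${list:+$list, }$host informix"
    done
    printf "EXECUTE FUNCTION admin('cdr add trustedhost', '%s');\n" "$list"
)

trustedhost_sql host1 host2
# To run it against the server instead of just printing it:
#   trustedhost_sql host1 host2 | dbaccess sysadmin -
```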

    
HDR ONCONFIG Changes
Next, disable temp table logging and enable the snapshot copy that the “ifxclone” command will make:


    
NOTE:  
  1. The TEMPTAB_NOLOG setting is only really required on the secondary. However, because the roles can be reversed, leaving it enabled on a server that becomes primary would affect applications that depend on rollback of TEMP tables. So either plan to correct the setting immediately after making a new primary, or be consistent and don’t rely on TEMP table transactions.
  2. If you are going to set up an RSS server instead of an HDR secondary, you need to enable the logging of index builds. This causes all index creation operations to go through the logs (hope you have automatic log backups configured):
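The settings from this section can be sketched as follows, with the RSS-only setting added when requested. The function name is hypothetical, and the commands are printed rather than executed so they can be reviewed before they touch a running server.

```shell
# Print the ONCONFIG changes for the chosen secondary type (HDR or RSS).
replication_onconfig_cmds() (
    kind="${1:-HDR}"
    echo "onmode -wf TEMPTAB_NOLOG=1"
    echo "onmode -wf ENABLE_SNAPSHOT_COPY=1"
    if [ "$kind" = "RSS" ]; then
        # RSS needs index builds to go through the logical logs.
        echo "onmode -wf LOG_INDEX_BUILDS=1"
    fi
)

replication_onconfig_cmds RSS
```

Piping the output to `sh` as the informix user on host1 applies the changes without restarting the server.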

    
INFORMIXSQLHOSTS
The “ifxclone” command will attempt to add the new server to the INFORMIXSQLHOSTS file. If an entry for the new server already exists, “ifxclone” will not properly modify the file to define a server group the way we want. If you already have an entry for the new server, delete it or comment it out for now:
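One way to comment out the entry is sketched below. It assumes the usual sqlhosts layout in which the server name is the first field of each line; the function name and the output file are placeholders.

```shell
# Comment out every sqlhosts line whose first field is the given server name.
# $1 = sqlhosts file, $2 = server name.
# The result goes to stdout, so the original file is left untouched.
comment_out_server() {
    awk -v srv="$2" '$1 == srv { print "#" $0; next } { print }' "$1"
}

# Example (review the new file before putting it in place):
#   comment_out_server "$INFORMIXSQLHOSTS" server2 > /tmp/sqlhosts.new
```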

    
Database Logging Mode 
Only databases using logging will be replicated. Databases using buffered logging can lose transactions if the log has not been flushed to disk and the server fails.  This same window of vulnerability extends to the replica as well.  If all databases are logging, or you don’t need the non-logging databases to be replicated, you can move on to the next step.  You can verify the list of the non-logging databases, and those that are using buffered logging by running the following query against the “sysmaster” database table “sysdatabases” using “dbaccess”: 
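A sketch of such a query, wrapped in a small shell helper so it can be piped to dbaccess. It relies on the is_logging and is_buff_log columns of sysmaster:sysdatabases; the helper name is hypothetical.

```shell
# Emit a query listing databases that are non-logging or use buffered logging.
logging_check_sql() {
    cat <<'EOF'
SELECT name, is_logging, is_buff_log
  FROM sysdatabases
 WHERE is_logging = 0 OR is_buff_log = 1;
EOF
}

# To run it on server1:
#   logging_check_sql | dbaccess sysmaster -
```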


    
If no rows are produced, then all databases are using unbuffered logging. Each database you want replicated that is currently non-logging (or is using buffered logging) can be converted to unbuffered logging during the ontape system backup, using the “ontape” option “-U” followed by the list of database names to change. For example, changing “db1” and “db2” to unbuffered logging during the Level 0 system backup requires the following ontape command:

    
Notes: 
  1. If you are not using ontape for backups, then you must use the “ondblog” command to change the logging mode, and then onbar for the Level 0 backup.
  2. A logging mode change requires an exclusive lock on the database, so plan for the database to be unavailable.
  3. If you later create a non-logging database, it will appear to exist on the secondary, but it will not be usable there.
  4. Changing a non-logging database to logging after Data Replication has been started is painful. You must:
  • Stop the secondary servers 
  • Change the logging mode and take a Level 0 backup
  • Restore the secondary from backup (or re-clone)
With Data Replication (DR) you should create all new Databases with some form of logging (ANSI, Unbuffered, or Buffered). 

Chunk Path/Device Information 

While we are still on server1, you should collect the path information for the chunks since these paths need to exist on the new server.  You can collect this with the “onstat -d” command.  We will discuss the output during setup of server2. 

Target Setup 

Most, if not all, of the new server configuration can be handled by “ifxclone”: 
  1. Copies the REMOTE_SERVER_CFG trusted host information
  2. Copies the ONCONFIG: Due to the number of configuration parameters that need to be the same, it is probably easiest to let “ifxclone” copy the ONCONFIG from the original server.  Otherwise you need to verify each of the ONCONFIG parameters which must be identical (listed in the Administrators Reference) have been correctly set.  If needed you can have “ifxclone” override some settings using the “-c” option. 
  3. Sets up the source and local INFORMIXSQLHOSTS files
The “ifxclone” command cannot set up the symlinks to raw devices, or create chunk files in directories that do not exist.

Create Needed Directories and Symlinks to Raw Devices
All dbspace chunk paths used on the primary must be the same on our new server. You will need to create the matching symlinks to the corresponding physical devices. For cooked files, “ifxclone” can create the missing chunk files automatically, even for the root dbspace, but it will not create the parent directories. The parent directories for those paths must already exist for “ifxclone” to successfully create the chunk files.

You can find the paths in the “onstat -d” output from “server1”.  For this example, “onstat -d” produced the following output, showing that the chunk paths are all under the “/chunkdir” directory:
[Figure: “onstat -d” output from “server1”, with all chunk paths under “/chunkdir”]
Then, for each of those paths, make sure the parent directories exist with adequate permissions. The above example only needs the “/chunkdir” directory to exist. It must also have the correct ownership and permissions, so you will need to do these operations as root/administrator:
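For a single directory, those operations could look like the sketch below. It uses the informix:informix ownership and 770 mode from the prerequisites; the function name is hypothetical.

```shell
# Create one chunk parent directory with the ownership and mode described in
# the prerequisites. Run as root on host2; the chown step is skipped when
# there is no informix account on the machine.
make_chunk_dir() (
    dir="$1"
    mkdir -p "$dir" || exit 1
    chmod 770 "$dir" || exit 1
    if id informix >/dev/null 2>&1; then
        chown informix:informix "$dir" 2>/dev/null ||
            echo "could not chown $dir; rerun as root" >&2
    fi
)

# For the example above:
#   make_chunk_dir /chunkdir
```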

    
Under those directories you can manually create the needed raw-device symlinks. They need permissions of 660, the same as any cooked files you choose to create manually.

If the list of chunks is long, you can collect the paths via the following command on “server1”, and use the output to create a script to handle the above steps for creating the required parent directories:
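A sketch of one way to collect them, assuming that the chunk path is the last field on each chunk line of the “onstat -d” output and that it is an absolute path. The function name is hypothetical, and paths containing spaces are not handled.

```shell
# Read `onstat -d` output on stdin and print each distinct parent directory
# of a chunk path. Only lines whose last field starts with "/" are used,
# which skips the headers and the dbspace lines.
chunk_parent_dirs() {
    awk '$NF ~ /^\// { print $NF }' |
    while IFS= read -r path; do
        dirname "$path"
    done | sort -u
}

# On server1:
#   onstat -d | chunk_parent_dirs
```

The resulting list has one directory per line, which makes it easy to turn into mkdir, chown, and chmod commands for host2.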

    
Run “ifxclone”
The “ifxclone” command will make “server1” a primary, perform a “fake” backup of “server1”, and restore it to “server2” as a new HDR secondary. From then on, all changes to logged spaces on “server1” will be continually applied to “server2”. The “ifxclone” command will also add the secondary to the INFORMIXSQLHOSTS file on the “source” server, and copy the “source” server’s ONCONFIG information to the target:
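A sketch of what the command line could look like, built by a small helper so the pieces are easy to see. The server names and host names come from this example, port 9088 is a placeholder, and the option spellings are my reading of the ifxclone syntax; verify them against the Administrator's Reference for your version. To my understanding, ifxclone is run on the target host (host2) as the informix user.

```shell
# Build the ifxclone command line for review.
# $1-$3 = source server name, host, port
# $4-$6 = target server name, host, port
# $7    = disposition (default HDR)
# Assumptions: --autoconf sets up sqlhosts and trust, and --createchunkfile
# creates missing cooked chunk files on the target.
ifxclone_cmd() {
    printf 'ifxclone -T --autoconf --createchunkfile'
    printf ' -S %s -I %s -P %s' "$1" "$2" "$3"
    printf ' -t %s -i %s -p %s' "$4" "$5" "$6"
    printf ' -d %s\n' "${7:-HDR}"
}

ifxclone_cmd server1 host1 9088 server2 host2 9088 HDR
```

Printing the command first makes it easy to check names and ports before anything is restored onto the target.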

    
See the Administrator’s Reference and the section for “The ifxclone utility” for complete information on ifxclone.

Check the Progress
To monitor the Data Replication Information and message log, run the following (Ctrl-C to break out):
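One way to do this is a small loop around “onstat -g dri” (Data Replication information) and “onstat -m” (message log), sketched below. The function name and the ONSTAT override are hypothetical conveniences, not part of Informix.

```shell
# Repeat the replication status and the message log.
# $1 = seconds between rounds (default 5)
# $2 = number of rounds (default 0, meaning run until Ctrl-C)
# ONSTAT can be set to another command for a dry run; it defaults to onstat.
watch_replication() (
    interval="${1:-5}"
    rounds="${2:-0}"
    n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        "${ONSTAT:-onstat}" -g dri
        "${ONSTAT:-onstat}" -m
        n=$((n + 1))
        sleep "$interval"
    done
)

# On host2, until Ctrl-C:
#   watch_replication 5
```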

    
The server state at the top of each output burst should change from “Initialization”, to “Fast Recovery”, then eventually to “Read-Only (Sec)”, and show it is paired with server1 and the Data Replication state is “on”:
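[Figure: onstat output on “server2” showing the “Read-Only (Sec)” state, paired with server1, with the Data Replication state “on”]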
Server Failed to Initialize?
If the server failed to initialize, check the message log for errors. Correct any missing paths, filesystem permissions, or ONCONFIG settings. Then rerun the “ifxclone” command, but add “--useLocal” so that you don’t overwrite the corrections you made in the local ONCONFIG.

INFORMIXSQLHOSTS was Updated
The “autoconf” option added a “group” to the server INFORMIXSQLHOSTS file, and assigned each server in the group using the “g=” option.  The group name is the name of the source server but with “g_” prepended.  The INFORMIXSQLHOSTS file from our example contains three lines now:

    
Verify Backup Device Configuration
Verify the ONCONFIG LTAPEDEV and TAPEDEV values (copied from server1 to server2) exist and have the correct permissions (RWX for owner and group) on the new server.  This will ensure that any automatic and manual backups run as intended.  If you need to change them on the secondary later, you can use “onmode -wf” command while the server is running to update the value. 

Copying the alarmprogram.sh ensures that the same automatic log backup configuration is used on server2.  In the event server2 becomes the primary, the log backups are already configured for operation. 

Manual Failover 

If server1 becomes unavailable, or you need to shut it down, the secondary can be made the primary. On the secondary server, run the following command, which both shuts down the current primary and makes the secondary the new primary (“force” is needed if the primary is already unavailable):

    onmode -d make primary server2 force

Until you run the command to make the secondary the primary, server2 is read-only (even when configured as updatable secondary, which forwards updates to the primary). 

Making a new Secondary from old Primary 

Once “server1” is working again, and provided it didn’t lose any data, it can become the secondary for the current primary (server2). Restart server1 with “oninit -PHY” (which pretends it was physically restored and is waiting for logical recovery), make it the secondary for the current primary so that it recovers the logical logs from the primary, and monitor the message log until the server is running as a secondary:
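The three steps can be sketched as the command sequence below, printed for review rather than executed. The helper name is hypothetical; its argument is the name of the current primary.

```shell
# Print the commands that bring the old primary back as a secondary.
# $1 = name of the current primary (server2 in this example)
rejoin_cmds() {
    printf '%s\n' \
        "oninit -PHY" \
        "onmode -d secondary $1" \
        "onstat -m"
}

rejoin_cmds server2
```

These are meant to be run on host1 as the informix user, repeating the last command until the message log shows the server operating as a secondary.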

    
If the server wasn’t down long, your cluster is complete again, with “server1” as your secondary.  However, if any of the needed logical logs are no longer available directly from the primary, the server will remain in “Fast Recovery (sec)” state, and message log will indicate you need to do a recovery from tape:

    
In this event, you need to restore the missing logs from the log backups of server2.  For example, if you are using “ontape”, you could copy the server2 logical log backups to host1, and place them in the server1 configured LTAPEDEV location.  Since the backup names all start with a prefix indicating “host2” and server number “0”, we provide this information via the IFX_ONTAPE_FILE_PREFIX environment variable, and then run the “ontape” command to restore the logical logs: 


    
If you lost a data disk, re-clone from “server2” by following the “Target Setup” steps again, but this time with “server1” as the target and “server2” as the source.

Client Failover to Secondary 

When planning for server failover in the event of a primary failure, the clients connecting to this cluster should be prepared for connecting to the current primary server accepting transactions.  The traditional way for enabling this, without utilizing the Connection Manager, is using sqlhosts groups instead of server names for connecting. 

Connect to Server Group 
Client applications configured to connect to the “g_server1” INFORMIXSERVER group will automatically try the first server in the group and, if that server is unavailable, the next server in the group, until they connect to the current primary. If the application automatically attempts to reconnect following a connection failure, the switch to a different server is transparent to users. Otherwise, restarting the application to force a new connection will find the current primary.

Client applications that use a JDBC URL or JDBC DataSource to connect to the server group need additional options that tell the driver where to find the SQLHOSTS information: either a path to the file (a local file or a URL from which to retrieve it) together with the type of path, or an LDAP server from which it can be retrieved.

Here is an example JDBC URL specifying the INFORMIXSERVER as the server group, SQLH_TYPE, and SQLH_FILE (SQLHOSTS file is “/local/sqlhosts”): 


    
You can see this and additional details and options for connecting to HA servers via JDBC in the IBM Informix JDBC Driver Programmer's Guide. 

Further Reading: Connection Manager 
The Connection Manager can monitor the primary server’s availability, and automatically force a secondary to become the primary. Clients connecting to the Connection Manager will be directed to the primary server. The Connection Manager can also implement rules to determine the best connection for clients.

See the Administrator’s Guide and the section on “Connection management through the Connection Manager”.  

Kevin Mayfield
Senior Solutions Architect at HCL


Informix is a trademark of IBM Corporation in at least one jurisdiction and is used under license.

1 Comment
Michal Lukaszewicz
1/25/2018 09:27:38 am

This is a very good and comprehensive reading. I was preparing myself to use ifxclone to speed up build process of dev HA boxes and after going through your article I must admit it looks very promising.
Thanks for sharing!
