User Community for HCL Informix

Informix Warehouse Accelerator for Cloud (3)

1/31/2018

Updated: 8/30/18

In this post we look at methods for defining data marts in the Informix Warehouse Accelerator (IWA). IWA is a software appliance for the Informix database server that uses in-memory, columnar data storage for business intelligence applications doing OLAP. For such workloads IWA typically achieves query acceleration of 10 to 100 times compared to running the workload within the Informix database server itself.

Part 3: Defining IWA Data Marts
In the last post we used the IWA Example Database to demonstrate the creation and administration of a data mart. This was quite easy because we used a ready-made data mart definition. Additionally, the example database itself is pretty simple. It serves well for explaining and understanding some basic concepts, but real use case scenarios are much more complex. While the administration steps for data marts always remain simple enough, finding the proper definition for a data mart can be quite difficult. And if the data mart is to accelerate a report that consists of hundreds of queries, this does not make things easier. Not all of the queries may be acceleratable. Figuring out which ones are, and from there the tables and columns needed in the data mart, can be a herculean task. Sometimes such reports have 'grown' over many years and there may be nobody around who knows all the details.
 
Brute Force 
One approach is to include all tables with all their columns in the data mart, or at least all columns of supported data types. However, this is probably not optimal and may not even be possible, because the resulting data mart may be too big. Often there are lengthy description columns that the typical OLAP queries never use. Such columns are prime candidates for omission from a data mart, making the data mart smaller and the data loading faster. Identifying these columns is a problem, especially when there are hundreds or even thousands of tables in the database.
 
Workload Analysis 
Workload analysis attacks the problem from the other side: analyze a given workload (e.g. one or several reports), filter out the queries that can be accelerated, and determine just the tables and columns that these queries need. For this task we use built-in functionality of the Informix database server called "query probing". Query probing does the filtering of acceleratable queries and automatically collects all the information necessary for building a data mart definition, including the references between tables. This is especially useful when no primary-key/foreign-key constraints are defined, which is often the case in a data warehouse.
 
We can further combine query probing with the SQL Tracing feature, which also collects timing information on the individual queries in a workload. Obviously, we need to actually execute the workload in order to get the timings. Once we have all the information from query probing and SQL Tracing, we can see which of the long running queries can be accelerated. We may even be able to identify long running queries that cannot be accelerated, and perhaps change them in some way to make them acceleratable.
 
Finally, we use the integrated procedure "ifx_probe2mart()" to automatically create a data mart definition from the collected query probing data. From there it is straightforward to create the data mart itself. The data mart load step after the creation will show how big the data mart is and whether it fits into IWA's memory at all. If the data mart turns out to be too big, we can scrutinize the data mart definition for a column or table that we can remove as a compromise. We can also search the collected SQL Trace statements for that table to see which queries need the table or column we would like to remove, and what run time they had. After all, we may not want to remove something that is needed by the longest running query, thus making it non-acceleratable.

Workload Analysis Example
 
Let's put workload analysis to work in a very simple example. We again use the IWA Example Database named "iwadb" from the last post. As workload we use the three example queries in the files "q1.sql", "q2.sql" and "q3.sql" that come with it. Following are the steps:

0. Preparation: Update statistics and create a logging database named "martdb" for the data mart definition:
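The original screenshot for this step is not reproduced here; a sketch of the preparation in dbaccess (database names as in the example, statistics mode assumed) could look like:

```sql
-- Refresh the optimizer statistics in the example database.
DATABASE iwadb;
UPDATE STATISTICS LOW;

-- Create the logging database for the data mart definition.
-- It must be created with logging.
CREATE DATABASE martdb WITH LOG;
```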
1. Clear probing data possibly left over from previous probings, turn on query probing and run the workload without actually executing it:
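A sketch of the probing session, with the "use_dwa" values as documented for IWA:

```sql
DATABASE iwadb;

-- Remove probing data possibly left over from earlier probing runs.
SET ENVIRONMENT use_dwa 'probe cleanup';

-- Turn on query probing for this session.
SET ENVIRONMENT use_dwa 'probe start';

-- Probe the workload queries without actually executing them.
SET EXPLAIN ON AVOID_EXECUTE;
```

The workload itself must then run in the same session, e.g. by concatenating these statements with "q1.sql", "q2.sql" and "q3.sql" into one script fed to a single dbaccess invocation.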
2. Check for collected probing data:
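This is done on the command line, as user informix on the server:

```shell
# Show the raw probing data collected so far.
onstat -g probe
```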
3. From the collected probing data create a data mart definition for a data mart with name "new_mart" in the logging database "martdb":
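A sketch of this step in dbaccess:

```sql
-- Connect to the logging database from the preparation step.
DATABASE martdb;

-- Build the definition of data mart 'new_mart' from the probing
-- data that was collected in database 'iwadb'.
EXECUTE PROCEDURE ifx_probe2mart('iwadb', 'new_mart');
```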
4. Extract the data mart definition to an XML file named "new_mart.xml":
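A sketch of the extraction (the target path "/tmp" is an assumption; see the note on the '!' suffix below):

```sql
DATABASE martdb;

-- Write the XML definition of 'new_mart' to a file on the server.
-- The trailing '!' keeps lotofile() from appending a unique suffix.
EXECUTE FUNCTION lotofile(ifx_genmartdef('new_mart'),
                          '/tmp/new_mart.xml!', 'server');
```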
A few thoughts on the above example:
 
We need to perform the preparation step (0) only once, even if we later want to adapt the data mart definition to a changed query workload, e.g. when new queries get added to the workload.
 
Probing data is kept in memory only. After a restart of the Informix server all previously collected probing data will be gone.
 
For the actual probing we run the statements to set the environment and the actual workload in the same dbaccess session. This is necessary because the shown environment settings are valid only for the duration of the session. Alternatively, it is also possible to set the variable "use_dwa" for a specific (existing) session, or to place the setting in a specific "sysdbopen()" procedure.
 
As we probe the workload without really executing it, we do not use SQL Tracing in this example. SQL Tracing would not produce timing information anyway.
 
The command "onstat -g probe" basically shows the raw probing information with little formatting to make it 'readable'. The command is useful for checking whether any probing data was collected at all. In-depth interpretation of the data is something for Tech Support.
 
To execute the procedure "ifx_probe2Mart()", we need to be connected to the logging database that we created in the preparation step. If a definition for the data mart already exists, any newly probed tables or columns will be added to the existing definition.
 
Therefore, if we want to get rid of an existing data mart definition in the logging database, we have to delete it manually with an SQL statement like the following:
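The original statement is not reproduced here; going by the logging-database schema that "ifx_probe2mart()" populates, it would be a delete keyed on the mart name. The table name "iwa_marts" is my assumption and should be verified against your logging database:

```sql
-- Connected to the logging database 'martdb':
-- remove the existing definition of 'new_mart'.
-- NOTE: the table name 'iwa_marts' is an assumption; check the
-- tables in your logging database. Referential constraints cascade
-- the delete to the dependent definition tables.
DELETE FROM iwa_marts WHERE martname = 'new_mart';
```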
When extracting the data mart definition as an XML file, we append '!' to the file name so that the function "lotofile()" does not add a string to make the file name unique between repeated runs.
 
In the last blog post I promised to touch on more conceptual details and administration tasks for data marts in IWA. Today's topic of defining a data mart does not exactly fit that promise, but at least it is an important aspect of the overall concept. I will certainly return to more details in a future blog post.

Martin Fuerderer
Advisory Development Engineer
Connect with me on LinkedIn

Informix is a trademark of IBM Corporation in at least one jurisdiction and is used under license.
