Monday, March 20, 2017

Tips about Data Import Export Framework performance


Architecture

When Microsoft started investing in the Data Import Export Framework (DIXF), performance was one of the main design considerations. In addition, the framework had to provide a solution for the normalized data model, where many tables are linked through foreign key relations based on record IDs.
The source is now first loaded into a staging table without any business logic, such as validation events or rules in insert methods. From the staging table the data is then copied to the target table(s), where the business logic is applied.

In an older version of Microsoft Dynamics AX, I had to deal with importing over 60,000 fixed assets, each with 4 value models, an acquisition value and cumulated depreciation. So in total there were over 240,000 value models and almost 500,000 journal lines to be posted for the opening balance. The source was a single flat Excel file with a certain template, and I used a script to read the Excel lines, create the assets, post acquisition and depreciation values, and then correct the value models for the number of remaining depreciation periods. This script worked fine as long as at most 400 assets were loaded. According to a calculation, processing the 240,000 value models and 500,000 transactions would have taken about 6-8 days. We then also created a staging table containing the Excel columns. From within AX we could process the business logic using the batch framework, which solved the problem: the transactions could be converted within the given timeframe. So this architecture is a good implementation.
A very cool part of the DIXF framework is the use of SQL Server Integration Services (SSIS) to get the source into the staging tables. Microsoft did a really good job here. The next picture shows the flow in general. It does not show which of the DIXF binary components (DLL extensions) takes care of which part of the integration with SSIS.
(Image: Data Import Export Framework performance flow)
The setup of processing groups in AX is the basis for the SSIS packages, which pick up the source records and put them, with or without some basic transformations, into the staging table. SSIS is very efficient and a real winner on performance for this type of task.

Performance tips

Although Microsoft seems to have invested a lot in performance for the Data Import Export Framework, you might still encounter some performance issues. You also need to be aware of some standard behavior and features to get the best performance when executing import jobs. I would recommend not changing objects when there is no real need for more performance. If you make changes to your environment based on this blog post, make sure you test them thoroughly. Changing your environment is at your own risk.

Staging tips

  • Index usage
    Make sure the unique index starts with the fields DefinitionGroup and ExecutionID. For example, in various AX 2012 environments the staging table for ledger opening balances (DMFLedgerJournalEntity) has an index that does not start with these fields. This makes both write and read actions slower and causes fragmented indexes. These two fields are key for Dynamics AX to have records grouped per processing group and execution step. When these fields are not the first fields in the index, it is like searching in a phone book that is sorted on phone number instead of city and last name.
    When you use the wizard to create a new entity, you have to check the primary index, as the two mentioned fields may not be in the correct position. But as said, also check existing entities. A query to verify the index column order is sketched below, after this list.
    Disable obsolete indexes. This has less impact on performance than the previous tip. An obsolete index will still be maintained on every write, but it could potentially help when you filter within the staging history details. So try to estimate the impact before going in this direction.
  • Conversion/Auto generated
    When possible, avoid the use of value conversions and auto-generated numbers. These lead to additional variables in the SSIS packages. If it is possible to pre-fill these values in the source, the source-to-staging process will be quicker.
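To verify the column order of the indexes on a staging table, a catalog query like the sketch below can be run against the AX business database. This is only a minimal sketch: DMFLedgerJournalEntity is used because it is the example table from this post, so replace it with the staging table you want to check.

```sql
-- List every index on the staging table with its key columns in key order,
-- so you can verify that DefinitionGroup and ExecutionID come first.
-- DMFLedgerJournalEntity is only the example table used in this post.
SELECT i.name          AS index_name,
       ic.key_ordinal  AS key_position,
       c.name          AS column_name,
       i.is_unique
FROM sys.indexes       AS i
JOIN sys.index_columns AS ic ON ic.object_id = i.object_id
                            AND ic.index_id  = i.index_id
JOIN sys.columns       AS c  ON c.object_id  = ic.object_id
                            AND c.column_id  = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.DMFLEDGERJOURNALENTITY')
  AND ic.key_ordinal > 0          -- key columns only, skip included columns
ORDER BY i.name, ic.key_ordinal;
```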

Target tips

  • Number of tasks when running in batch can be set to divide the work over multiple batch threads and servers. If you have a file with a very large number of records, you can set up e.g. 24 tasks. With 240,000 records this would create tasks with bundles of 10,000 records each. See also the next two items, as they are related to this tip.
  • When possible, you can increase the number of batch servers during data migration. A customer environment may have e.g. 4 AOS servers of which only one is set up to run as a batch server. It is possible to temporarily add the other AOS servers as batch servers as well. Don’t install multiple instances of an AOS on the same server, as they would then have to share the CPU capacity.
  • Maximum number of threads on the batch server. You can test whether adding more threads or reducing the number benefits performance. The default value is 8. Some sites mention that 2 threads per core is recommended, but you can experiment with this setting.
  • Avoid running business logic and validations. Use them only when really required. E.g. inventory journal entities need to have business logic enabled for the insert methods. But when there is no need to call insert, update or validation methods, don’t enable them.
  • Temporarily disable indexes on the target table(s) that are not set to be unique. A nice example is the journal lines table (LedgerJournalTrans). This table is a record holder when it comes to the number of indexes. When you disable them during the staging-to-target execution step, they will not be maintained during the data import. After the import has completed, you can enable them again; rebuilding the indexes at that point is much faster than maintaining them during the write actions of the data import. A T-SQL sketch of this approach follows after this list.
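A minimal T-SQL sketch of the disable/rebuild approach, assuming it is run against the AX business database. The index name in step 2 is a placeholder, not an actual AX index name, so first list the non-unique, non-clustered indexes and decide which ones to disable; never disable unique or clustered indexes this way.

```sql
-- 1. Find candidates: non-unique, non-clustered indexes on the target table.
SELECT name
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.LEDGERJOURNALTRANS')
  AND is_unique = 0
  AND type_desc = 'NONCLUSTERED';

-- 2. Disable a candidate before the staging-to-target step
--    (I_EXAMPLEIDX is a placeholder name).
ALTER INDEX I_EXAMPLEIDX ON dbo.LEDGERJOURNALTRANS DISABLE;

-- 3. After the import has completed, rebuild the indexes again.
ALTER INDEX ALL ON dbo.LEDGERJOURNALTRANS REBUILD;
```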

There is more…

There can be more possibilities for improving performance. Some are related to SQL Server tuning, others to the hardware you are using. There are also two settings in the DIXF parameters that can cause performance problems when changed to the wrong values.

The first field is Data access mode. This field is a setting for SSIS. When it is NOT set to the fast load option, the records are committed one by one, so use the fast load setting. When a postman has 10 letters for the same address, he can insert them one by one or all at once; the last option is comparable to the fast load option.

The Maximum insert commit size field tells how many records will be inserted before a commit command is executed. E.g. if the mailbox only has room for 5 letters, 2 inserts are needed to put in the 10 letters. The default value is 2147483647, which effectively means there is no limitation and all records will be committed at once. When you have e.g. a limited size for your TempDB on the SQL Server, you may need to adjust this setting to e.g. 100,000 records per commit action.
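To make the commit-size idea a bit more concrete, here is a small illustrative T-SQL fragment. It only mimics the difference in commit behavior; it is not how SSIS performs the fast load internally, and the #Mailbox table is made up for the example.

```sql
-- Made-up illustration of commit behavior, not SSIS internals.
CREATE TABLE #Mailbox (LetterId INT NOT NULL);

-- Without fast load: every record is committed on its own (10 commits).
DECLARE @i INT = 1;
WHILE @i <= 10
BEGIN
    BEGIN TRANSACTION;
    INSERT INTO #Mailbox (LetterId) VALUES (@i);
    COMMIT TRANSACTION;
    SET @i += 1;
END;

-- With fast load and a large maximum insert commit size:
-- all records are inserted with a single commit.
BEGIN TRANSACTION;
INSERT INTO #Mailbox (LetterId)
VALUES (11), (12), (13), (14), (15), (16), (17), (18), (19), (20);
COMMIT TRANSACTION;

DROP TABLE #Mailbox;
```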


