Data Integrity Issues

From DataSelf Knowledge Base
Jump to navigation Jump to search

Data integrity issues can cause several types of data import issues such as:

  • The whole auto refresh hangs
  • The auto refresh quits before completing
  • The auto refresh completes but have incomplete data (ex.: one table was not completely refreshed).

DataSelf ETL Logging of Data Integrity Issues

Data integrity issues might prevent the ETL from completing the refresh process.

However, depending on the type of issue, the ETL skips the importing of records with offending data, and then creates log files that show exactly what the offending records and fields are.

To access those log files, go to \Program Files (x86)\DataSelf\DataSelf ETL\ErrorDumps. You'll find an html file for each table that has data integrity issues. Open an html file to see the records that were NOT imported in the data warehouse, and the offending data in red fonts.


Invalid Date Fields

DataSelf uses MS SQL Server as its database engine. MS SQL Server only accepts date values that meet the following rules:

  • Values between 1/1/1753 to 12/31/9999.
  • Values consistent with calendar periods (examples of invalid dates: Feb 29 for non-leap years, or April 31, or a month outside 1 to 12 range).
  • In most cases, NULLs are accepted.

It's not uncommon for certain source databases to have date values that don't fit the rules above. That typically happens when the entry of date fields are not properly validated.

When that happens, we recommend running a report (such as Crystal or Excel) from the table having issues directly from the source database. Pull all date fields into the report:

  • To find date values outside of the accepted range: Run the report and sort it by each of the date fields at a time. Sort ascending and descending and see if you find invalid date values.
  • To find dates not consistent with calendar periods: This can get a bit trickier. You may use Excel formulas to extract year, month and days from a date field and then analyze them apart (ex.: sort the month column and see if you find any outside of the 1 to 12 range).
  • IMPORTANT: With Providex databases, some issues will require Crystal Reports (Excel will NOT show them). In Crystal Reports, create YEAR() formulas from all date fields: values that come up as 0.0 are bad (blank dates are Ok). Another way is to export the file contents to a txt file using Sage 100's Visual Integrator module. Once you have that file you can open it in Excel and filter on the date fields. This will usually show up an incorrect date field.

If you find issues, be sure to clean them up in the source system or database. If this kind of issue continues to occur on an ongoing basis, we recommend setting up procedures in your source system to validate dates or removing the offending date fields from DataSelf BI.

After you fix the issue, it's likely that DataSelf BI will refresh successfully. If not, please contact support@dataself.com.

Non-numeric Values in Numeric Fields

DataSelf uses MS SQL Server as its database engine. MS SQL Server only accepts numeric values in numeric fields.

In some source databases, you may find non-numeric values within fields that are supposed to be for numeric only. This is particularly frequent when importing data from Excel data sources.

When that happens, we recommend running a report (such as Excel) from the table having issues directly from the source database and pull all numeric fields into the report:

  • To find non-numeric values: Run the report and sort it by each of the numeric fields at a time. Sort ascending and descending and see if you find non-numeric values. Sometimes you'll see numbers at the beginning or end of the sorting list, but they are out of order. In this case, it's likely that Excel is treating that value as a character (there's probably a non-numeric character inside of that particular Excel cell, example " 123" instead of "123").

If you find issues, be sure to clean them up in the source system or database.

After you fix the issue, it's likely that DataSelf BI will refresh successfully. If not, please contact support@dataself.com.