Data Preparation

How to cleanse data files prior to import.

Data Cleansing

Before running the service, implementors SHOULD review the import data files for any explicit anomalies. Anomalies that result in invalid rows may include records where a referenceIdentifier was provided but one or more required fields, such as firstName, lastName, or sex, were missing.

A few examples of records that will not validate (anonymized for privacy purposes) include:

Example Record                           Error Description
351233, Mark, Smith, M, 1899-12-31       Anomalous DOB format
525389, ZzAna, ZzGomez, F, 10/10/10      Addition of Zz may indicate duplicates
423452, TRUE, Sam, Ng, F, 1/22/94        TRUE in firstName field
These types of records SHOULD be corrected manually or completely removed from the import data set.
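A pre-import review like the one described above can be partially automated. The sketch below is a minimal anomaly scan, assuming a comma-separated layout of referenceIdentifier, firstName, lastName, sex, dateOfBirth and an ISO `YYYY-MM-DD` date format; the column order, the Zz-prefix heuristic, and the pre-1900 placeholder-date check are assumptions based on the examples, not requirements of the adapter.

```python
import csv
import io
from datetime import datetime

# Assumed column layout; adjust to match the actual import file.
FIELDS = ["referenceIdentifier", "firstName", "lastName", "sex", "dateOfBirth"]
REQUIRED = ["referenceIdentifier", "firstName", "lastName", "sex"]

def find_anomalies(csv_text):
    """Return (row, reasons) pairs for rows that look invalid."""
    flagged = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        record = dict(zip(FIELDS, row))
        reasons = []
        if len(row) != len(FIELDS):
            reasons.append("unexpected field count")
        for field in REQUIRED:
            if not record.get(field, "").strip():
                reasons.append(f"missing {field}")
        if record.get("firstName", "").upper() in ("TRUE", "FALSE"):
            reasons.append("boolean value in firstName field")
        if any(record.get(f, "").startswith("Zz") for f in ("firstName", "lastName")):
            reasons.append("Zz prefix may indicate a duplicate")
        try:
            dob = datetime.strptime(record.get("dateOfBirth", ""), "%Y-%m-%d")
            if dob.year < 1900:  # placeholder dates such as 1899-12-31
                reasons.append("anomalous DOB")
        except ValueError:
            reasons.append("anomalous DOB format")
        if reasons:
            flagged.append((row, reasons))
    return flagged
```

Flagged rows can then be corrected manually or dropped before the file is handed to the adapter.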

The adapter validates each record against the JSON schema file, which ensures these types of records do not affect the end result and indicates any field-type errors. Processing a file with invalid fields will return the following error, and the file will not be processed:

Invalid records that need to be corrected before processing:
{525389, ZzAna, ZzMaria, ZzGomez, F, 10/10/10}
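The kind of field-type check the schema enforces can be illustrated with a small stand-in validator. The field names and expected types below are assumptions for illustration; the real constraints live in the JSON schema file shipped with the adapter, and records are assumed to already be parsed into typed values.

```python
# Hypothetical stand-in for the adapter's JSON schema: the real field
# names and constraints come from the schema file shipped with the adapter.
SCHEMA = {
    "referenceIdentifier": int,
    "firstName": str,
    "lastName": str,
    "sex": str,
}

def field_type_errors(record):
    """Return a list of field-type problems for one parsed record."""
    errors = []
    for field, expected in SCHEMA.items():
        value = record.get(field)
        if value is None or value == "":
            errors.append(f"{field}: required field is missing")
        elif not isinstance(value, expected):
            errors.append(f"{field}: expected {expected.__name__}, "
                          f"got {type(value).__name__}")
    return errors
```

A record with a boolean in firstName, like the TRUE example above, would produce a type error here rather than silently passing through.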

Entity-Specific Reference Identifiers

An entity-specific referenceIdentifier is required for every record. These reference identifiers MUST be unique per record and specific to the import file's system of record.

In the case that an import file does not have a referenceIdentifier available, an autogenerated field of type uint (unsigned 64-bit integer) or type id MUST be provided. The adapter will not process import files that lack one. Providing an import file without a reference identifier will return the following error:

Invalid records that need to be corrected before processing:
{ , Jane, Emily, Doe, F, 09/22/95}
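One way to autogenerate the required identifier, sketched here purely as an assumption, is to derive a deterministic unsigned 64-bit integer from the record's remaining fields; any scheme is acceptable as long as each record in the file receives a unique value.

```python
import hashlib

def backfill_reference_identifier(record_fields):
    """Derive an unsigned 64-bit integer from a record's field values.

    Hypothetical approach: hash the fields so the same source record
    always maps to the same identifier across repeated imports.
    """
    digest = hashlib.sha256("|".join(record_fields).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")  # fits in uint64
```

Note that two byte-identical rows would hash to the same value, violating the per-record uniqueness requirement, so deduplicate the file first (or fall back to a simple counter) before backfilling.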
