Schema Setup

Overview of how to prepare schemas for identity resolution.

Overview

To make resolution scalable across agencies, Attest's identity resolution API defines a canonical identity schema that standardizes data fields across systems. To meet the requirements of this schema, each participating agency MUST define a schema mapping file for its existing system of record.

Schema mapping files ensure the identity resolution API is able to:

Understand the naming conventions of specific fields across agencies.
Use a single standard field name and definition for consistent matching results.

This process ensures field-level consistency in a scenario where Agency A uses the field name Gender whereas Agency B uses the field name Sex.

Constructing Schemas

Schema field objects have the following structure:

"dateOfBirth": {
    "field": "DOB",
    "format": "%Y/%m/%d"
}
...

// Object structure
"[resolver_field_name]": {
    "field": "[field_name_in_import_file]",
    "format": "[date_format_in_import_file]"
},
...

Schema files MUST be valid JSON. Schema files MUST contain the following fields so that the import file can be processed.

If an agency does not record an optional field such as middleName, implementors MUST omit the field from the schema file.
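As a minimal sketch of the structure above, the following Python builds a schema mapping from an agency's column names and leaves out any optional field the agency does not record. The helper name `build_schema` and its parameters are hypothetical and not part of the Attest adapter; only the canonical field names come from this page.

```python
import json

# Optional canonical fields, per the field requirements on this page.
OPTIONAL_FIELDS = {"middleName", "gender"}


def build_schema(agency_name, columns, date_format):
    """Build a schema mapping dict (hypothetical helper, not part of the adapter).

    columns maps canonical field names to the agency's column names;
    a value of None means the agency does not record that field.
    """
    schema = {"name": agency_name}
    for canonical, column in columns.items():
        if column is None:
            if canonical not in OPTIONAL_FIELDS:
                raise ValueError(f"{canonical} is required")
            continue  # omit unrecorded optional fields entirely
        schema[canonical] = {"field": column}
    # Only dateOfBirth carries a format entry.
    schema["dateOfBirth"]["format"] = date_format
    return schema


schema = build_schema(
    "Economic Opportunities Commission",
    {
        "referenceIdentifier": "Id",
        "firstName": "FirstName",
        "middleName": None,  # not recorded by this agency
        "lastName": "LastName",
        "gender": "GenderCode",
        "dateOfBirth": "Birthday",
    },
    "%m/%d/%y",
)
print(json.dumps(schema, indent=2))
```

The printed JSON contains no middleName key at all, rather than a key with an empty or null value.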

Schema Field Requirements

| Field | Type | Description | Format |
| --- | --- | --- | --- |
| name | string | Name of the agency or organization | None |
| referenceIdentifier | integer | Agency/organization-specific reference identifier, unique per individual | None |
| firstName | string | First name of the individual | None |
| middleName * | string | Middle name of the individual | None |
| lastName | string | Last name of the individual | None |
| gender * | string | Gender of the individual | A variety of formats are accepted, for example F or Female |
| dateOfBirth | string | Date of birth of the individual | MUST match the import file format, including use of "/" or "-", for example %m/%d/%Y or %Y-%m-%d |

* Indicates a field that is not required in the schema file.
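The format value resembles a strptime-style pattern. Assuming the adapter interprets it the way Python's `datetime.strptime` does (an assumption; this page does not say which parser the adapter uses), the sketch below shows why the pattern has to match the import file's separators and year width. In Python's dialect, month and day are the lowercase `%m` and `%d`. The helper `parse_dob` is hypothetical.

```python
from datetime import datetime


def parse_dob(value, fmt):
    """Parse a date-of-birth string; returns None when it does not match fmt."""
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


# Separators and year width must match the pattern exactly.
print(parse_dob("02/10/2020", "%m/%d/%Y"))  # 2020-02-10
print(parse_dob("2020-02-10", "%Y-%m-%d"))  # 2020-02-10
print(parse_dob("02-10-2020", "%m/%d/%Y"))  # None: "-" does not match "/"

# A two-digit year (%y) is ambiguous: Python maps 00-68 to 2000-2068.
print(parse_dob("02/10/65", "%m/%d/%y"))    # 2065-02-10
```

For dates of birth, a four-digit year pattern avoids that ambiguity when the import file provides one.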

Example Schema Files

fusd_02_10_2020.json

{
  "name": "Fresno Unified",
  "referenceIdentifier": {
    "field": "StudentID"
  },
  "firstName": {
    "field": "FirstName"
  },
  "middleName": {
    "field": "MiddleName"
  },
  "lastName": {
    "field": "LastName"
  },
  "gender": {
    "field": "Gender"
  },
  "dateOfBirth": {
    "field": "DateOfBirth",
    "format": "%m/%d/%y"
  }
}

fph_02_10_2020.json

{
  "name": "Fresno Public Health",
  "referenceIdentifier": {
    "field": "Id"
  },
  "firstName": {
    "field": "First"
  },
  "middleName": {
    "field": "Middle"
  },
  "lastName": {
    "field": "Last"
  },
  "gender": {
    "field": "Sex"
  },
  "dateOfBirth": {
    "field": "DOB",
    "format": "%m/%d/%y"
  }
}

eoc_02_10_2020.json

{
  "name": "Economic Opportunities Commission",
  "referenceIdentifier": {
    "field": "Id"
  },
  "firstName": {
    "field": "FirstName"
  },
  "lastName": {
    "field": "LastName"
  },
  "gender": {
    "field": "GenderCode"
  },
  "dateOfBirth": {
    "field": "Birthday",
    "format": "%m/%d/%y"
  }
}
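To illustrate what a schema file lets the resolver do, the sketch below applies the fph_02_10_2020.json mapping to one CSV row and renames the agency's columns to the canonical field names. It is an illustration under assumptions, not the adapter's implementation: the function `to_canonical` is hypothetical and the CSV row is made-up sample data.

```python
import csv
import io
import json

# Schema text copied from the fph_02_10_2020.json example above.
schema = json.loads("""
{
  "name": "Fresno Public Health",
  "referenceIdentifier": {"field": "Id"},
  "firstName": {"field": "First"},
  "middleName": {"field": "Middle"},
  "lastName": {"field": "Last"},
  "gender": {"field": "Sex"},
  "dateOfBirth": {"field": "DOB", "format": "%m/%d/%y"}
}
""")

# Made-up import data, for illustration only.
csv_text = "Id,First,Middle,Last,Sex,DOB\n1001,Ana,Maria,Lopez,F,02/10/20\n"


def to_canonical(row, schema):
    """Rename one CSV row's columns to the canonical field names."""
    return {
        canonical: row[spec["field"]]
        for canonical, spec in schema.items()
        if isinstance(spec, dict)  # skips the top-level "name" string
    }


records = [to_canonical(row, schema) for row in csv.DictReader(io.StringIO(csv_text))]
print(records[0]["gender"])  # F
```

The agency's Sex column ends up under the canonical gender key, so records from agencies with different column names can be compared field by field.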

Schema/CSV Naming Conventions

The JSON schema file and the corresponding agency CSV file must share exactly the same name, differing only in file extension, so that the two can be paired. For example:

// JSON schema file
fusd_02_10_2020.json

// Data import file
fusd_02_10_2020.csv

If the two file names do not match, the following error is returned:

Issue processing file: The schema file "fusd_02_10_2020.json" for "fusd_02_10_2020.csv" does not exist, please ensure it exists before depositing the CSV.

The sandbox environment used for testing can be easily reset; however, a lightweight naming/versioning process SHOULD be used when creating schema files. An example naming convention is the agency abbreviation plus the date of the upload or test:

// JSON schema file
[agency_abbreviation]_[month]_[day]_[year].json

// Data import file
[agency_abbreviation]_[month]_[day]_[year].csv
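The naming convention above can be sketched in Python as follows. Both helpers are hypothetical, and `schema_path_for` additionally assumes the schema file lives in the same directory as the CSV, which this page does not state.

```python
from datetime import date
from pathlib import Path


def file_stem(agency_abbreviation, upload_date):
    """Build the shared base name: [agency_abbreviation]_[month]_[day]_[year]."""
    return f"{agency_abbreviation}_{upload_date:%m_%d_%Y}"


def schema_path_for(csv_path):
    """Return the schema path with the same base name as the given CSV file.

    Assumption: the schema sits in the same directory as the CSV.
    """
    return Path(csv_path).with_suffix(".json")


stem = file_stem("fusd", date(2020, 2, 10))
print(stem + ".json")  # fusd_02_10_2020.json
print(stem + ".csv")   # fusd_02_10_2020.csv
print(schema_path_for("fusd_02_10_2020.csv"))  # fusd_02_10_2020.json
```

Deriving both names from one stem keeps the pair from drifting apart when a file is renamed or re-dated.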

Validating JSON Schema Files

Before executing the adapter locally, implementors SHOULD use a JSON schema validator tool to check the schema file for JSON formatting or JSON schema errors. A variety of schema validators are available online, including the following:

JSON Schema Validator - Newtonsoft

Below, the schema definition file is followed by an example agency schema file. A valid agency schema file should return a valid result when validated against the schema definition.

{
  "definitions": {},
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "http://example.com/root.json",
  "type": "object",
  "title": "The Root Schema",
  "required": [
    "name",
    "referenceIdentifier",
    "firstName",
    "lastName",
    "dateOfBirth"
  ],
  "properties": {
    "name": {
      "$id": "#/properties/name",
      "type": "string",
      "title": "The global name to reference.",
      "default": "",
      "examples": [
        "An agency name."
      ],
      "pattern": "^(.*)$"
    },
    "referenceIdentifier": {
      "$id": "#/properties/referenceIdentifier",
      "type": "object",
      "title": "The reference identifier of the source data.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/referenceIdentifier/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "Identifier"
          ],
          "pattern": "^(.*)$"
        }
      }
    },
    "firstName": {
      "$id": "#/properties/firstName",
      "type": "object",
      "title": "The name of the first name field to map from.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/firstName/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "FirstName"
          ],
          "pattern": "^(.*)$"
        }
      }
    },
    "middleName": {
      "$id": "#/properties/middleName",
      "type": "object",
      "title": "The name of the middle name field to map from.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/middleName/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "MiddleName"
          ],
          "pattern": "^(.*)$"
        }
      }
    },
    "lastName": {
      "$id": "#/properties/lastName",
      "type": "object",
      "title": "The name of the last name field to map from.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/lastName/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "LastName"
          ],
          "pattern": "^(.*)$"
        }
      }
    },
    "gender": {
      "$id": "#/properties/gender",
      "type": "object",
      "title": "The name of the gender field to map from.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/gender/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "Sex"
          ],
          "pattern": "^(.*)$"
        }
      }
    },
    "dateOfBirth": {
      "$id": "#/properties/dateOfBirth",
      "type": "object",
      "title": "The date of birth field to map from.",
      "required": [
        "field"
      ],
      "properties": {
        "field": {
          "$id": "#/properties/dateOfBirth/properties/field",
          "type": "string",
          "title": "The Field Schema",
          "default": "",
          "examples": [
            "DOB"
          ],
          "pattern": "^(.*)$"
        },
        "format": {
          "$id": "#/properties/dateOfBirth/properties/format",
          "type": "string",
          "title": "The Format of the date of birth field.",
          "default": "",
          "examples": [
            "%m/%d/%y"
          ],
          "pattern": "^(.*)$"
        }
      }
    }
  }
}

{
  "name": "Fresno Unified",
  "referenceIdentifier": {
    "field": "StudentID"
  },
  "firstName": {
    "field": "FirstName"
  },
  "middleName": {
    "field": "MiddleName"
  },
  "lastName": {
    "field": "LastName"
  },
  "gender": {
    "field": "Gender"
  },
  "dateOfBirth": {
    "field": "DateOfBirth",
    "format": "%m/%d/%y"
  }
}
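Where an online validator is not convenient, a rough local pre-check can catch the most common mistakes. The sketch below uses only the Python standard library and mirrors just the required lists and the string type of each field entry from the schema definition above; it is a simplified approximation, not full JSON Schema validation, and the function name `check_schema` is hypothetical.

```python
import json

REQUIRED = ["name", "referenceIdentifier", "firstName", "lastName", "dateOfBirth"]
OPTIONAL = ["middleName", "gender"]


def check_schema(text):
    """Return a list of problems found in a schema file's text (empty if none)."""
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(schema, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing required field: {key}" for key in REQUIRED if key not in schema]
    if "name" in schema and not isinstance(schema["name"], str):
        problems.append("name must be a string")
    # Every mapped field present must be an object with a string "field".
    for key in REQUIRED[1:] + OPTIONAL:
        if key not in schema:
            continue
        entry = schema[key]
        if not isinstance(entry, dict) or not isinstance(entry.get("field"), str):
            problems.append(f'{key} must be an object with a string "field"')
    return problems


good = (
    '{"name": "A", "referenceIdentifier": {"field": "Id"},'
    ' "firstName": {"field": "First"}, "lastName": {"field": "Last"},'
    ' "dateOfBirth": {"field": "DOB", "format": "%m/%d/%y"}}'
)
bad = '{"name": "A", "firstName": "First"}'
print(check_schema(good))  # []
print(check_schema(bad))
```

A file that passes this pre-check can still fail a real JSON Schema validator, so it complements rather than replaces the validation step.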

Troubleshooting Invalid JSON Schemas

If a schema file does not conform to the JSON schema definition listed above, the adapter will return the following error:

:= Waiting for file to be dropped:
****> Issue processing file: The schema file "something.json" for "something.csv" does not exist, please ensure it exists before depositing the CSV.
:= Waiting for file to be dropped:

If this error is returned, be sure to complete the step described under Validating JSON Schema Files.

Last updated 5 years ago
