Overview
csv-ingestion is a generalized Lambda for ingesting content (typically user records) from a CSV file located in an S3 bucket on the Orbita account into an Orbita project using a particular content schema.
Lambda Event Properties
description (string, optional)
A description of the particular ingestion process (e.g., “CVS Pharmacy Virtual Assistant - Lisinopril - Production”). If specified, it is used in notifications; if omitted while notifications are enabled, the environment name is used instead.
notify (array of emails, optional)
Email addresses to send a notification to when the process runs.
environment (string, required)
The name of the environment as specified in environments.json. New environments will need their connection information added to environments.json.
projectID (string, required)
The ID of the project in Orbita on the target environment that the content will be ingested into.
s3Bucket (string, required)
The path to the S3 bucket that the file will be read from.
filename (string|object, required)
There are two potential formats for the filename parameter:
...
```json
"filename": {
  "prefix": "inbound/patients-",
  "dateFormat": "YYYYMMDD"
}
```
encoding (string, optional)
The encoding of the file to be ingested. Acceptable values are:
“utf8” (Default)
“utf16le”
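To illustrate the filename and encoding parameters, the Python sketch below shows one way the object form of filename could resolve to a dated key, and how the two encoding values correspond to standard codecs. It is not the Lambda's source. The function names are hypothetical, and the sketch assumes that the object form means the prefix followed by the run date rendered with dateFormat, and that a string filename is used as given. Whether the Lambda appends an extension or matches on the prefix is not covered here.

```python
import csv
import io
from datetime import date

# Assumption: only the date tokens that appear in this document are handled.
DATE_TOKENS = {"YYYY": "%Y", "MM": "%m", "DD": "%d"}

# The two documented encoding values and the Python codecs they correspond to.
CODECS = {"utf8": "utf-8", "utf16le": "utf-16-le"}


def resolve_filename(filename, run_date):
    """Return the key (or key prefix) described by the filename parameter."""
    if isinstance(filename, str):
        return filename  # assumption: a string is used as given
    fmt = filename["dateFormat"]
    for token, code in DATE_TOKENS.items():
        fmt = fmt.replace(token, code)
    return filename["prefix"] + run_date.strftime(fmt)


def read_rows(raw, encoding="utf8"):
    """Decode the file's bytes and parse them into one dict per CSV row."""
    if encoding not in CODECS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    text = raw.decode(CODECS[encoding])
    return list(csv.DictReader(io.StringIO(text, newline="")))
```

Under these assumptions, the filename object shown above resolves to inbound/patients-20240305 for a run on 5 March 2024.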
schema (object)
An object indicating the target Orbita content schema, with the following properties:
key (string, required) - The key of the schema, which can be found in the schema details in Orbita.
type (string, required) - The type of the schema, either “content” or “dynamic”
Indicates whether a request should be made after ingestion to trigger Elasticsearch indexing of the primary schema.
nullValues (array of strings, optional)
An array of string values that should each be treated as null when evaluating required fields.
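A minimal Python sketch of how such a list might be applied when checking a cell. The values "NULL" and "N/A" and the function name are illustrative only, and treating an empty string as missing is an assumption rather than documented behavior.

```python
# Hypothetical configuration: "nullValues": ["NULL", "N/A"]
NULL_VALUES = ["NULL", "N/A"]


def is_null(value, null_values=()):
    """Return True when a CSV cell should count as missing."""
    # Assumption: absent cells and empty strings are also treated as null.
    return value is None or value == "" or value in null_values
```

With this check, a required field whose cell contains "N/A" would be reported as missing rather than ingested as the literal text.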
mapping (array of mappings)
An array of mappings of fields in the ingested CSV file to fields in the Orbita schema, using the following properties:
...
```json
{ "from": "PTNT_LAST_NM", "to": "lastName", "required": true },
{ "from": "PTNT_DT_OF_BRTH", "to": { "field": "dateOfBirth", "type": "date" }, "required": true },
{ "from": "Opportunity Type", "to": "drug", "required": true, "acceptableValues": [ "Lisinopril" ] }
```
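The mappings above can be read as a small set of rules applied to each CSV row. The following Python sketch shows one possible interpretation and is not the Lambda's implementation: map_row is a hypothetical name, the sketch assumes that a value outside acceptableValues causes the row to be rejected, and it ignores the "type" conversion in the object form of "to".

```python
def map_row(row, mapping, null_values=()):
    """Apply mapping rules to one CSV row and return (record, errors)."""
    record, errors = {}, []
    for rule in mapping:
        value = row.get(rule["from"])
        to = rule["to"]
        # "to" is either a field name or an object with "field" and "type".
        field = to["field"] if isinstance(to, dict) else to
        if value is None or value == "" or value in null_values:
            if rule.get("required"):
                errors.append(f"{rule['from']} is required")
            continue
        allowed = rule.get("acceptableValues")
        if allowed is not None and value not in allowed:
            # Assumption: an unacceptable value is an error for the row.
            errors.append(f"{rule['from']} has unacceptable value {value!r}")
            continue
        record[field] = value
    return record, errors
```

A caller would ingest the record only when the errors list is empty, and could include the errors in the notification email.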
staticFields (array of objects)
An array of objects indicating that certain fields in the Orbita schema should always be initialized to a particular value, using the following properties:
...
```json
"staticFields": [
  { "field": "campaignStatus", "value": "new" },
  { "field": "hasEngaged", "value": false },
  { "field": "channel", "value": null }
]
```
secondaryRecords (array of secondary record definitions, optional)
An array of definitions for secondary records to be created, containing the schema, mappings from the primary record, and any static field initializations:
...
```json
"secondaryRecords": [
  {
    "schema": { "key": "patientstatus", "type": "content" },
    "mapping": [
      { "from": "_id", "to": "patientId" }
    ],
    "staticFields": [
      { "field": "status", "value": "active" }
    ]
  }
]
```
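As a rough Python sketch of how a definition like the one above could be turned into a secondary record: build_secondary_records is a hypothetical helper, not part of the Lambda, and it assumes that "_id" refers to the ID of the already-saved primary record and that static fields are applied after the mapped fields.

```python
def build_secondary_records(primary, definitions):
    """Derive one secondary record per definition from a primary record."""
    results = []
    for definition in definitions:
        # Copy the mapped fields from the primary record.
        record = {
            rule["to"]: primary.get(rule["from"])
            for rule in definition.get("mapping", [])
        }
        # Assumption: static fields are set last and win over mapped values.
        for item in definition.get("staticFields", []):
            record[item["field"]] = item["value"]
        results.append({"schema": definition["schema"], "record": record})
    return results
```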
Example Event
Here is a complete example of the JSON for an event that could be used to run the csv-ingestion Lambda:
```json
{
  "description": "Client Name - Referral Ingestion (Prod Admin)",
  "notify": [ "andrew.merola@orbita.ai", "mark.cline@orbita.ai" ],
  "environment": "clientName-prod",
  "projectID": "6075d174b497b60079cXXXXX",
  "s3Bucket": "private-orbitahealth/clients/clientName/oe/.oe/customdata/6075d174b497b60079cXXXXX/sftp",
  "filename": {
    "prefix": "inbound/referrals-",
    "dateFormat": "YYYYMMDD"
  },
  "schema": {
    "key": "referral",
    "type": "content"
  },
  "mapping": [
    { "from": "PATIENT_ID", "to": "patientID", "required": true },
    { "from": "LASTNAME", "to": "lastName", "required": true },
    { "from": "FIRSTNAME", "to": "firstName", "required": true },
    { "from": "EMAIL", "to": "email" },
    { "from": "PHONE", "to": { "field": "phone", "type": "phone" } },
    { "from": "DATE_OF_SERVICE", "to": { "field": "dateOfService", "type": "date" }, "required": true }
  ],
  "staticFields": [
    { "field": "campaignStatus", "value": "new" },
    { "field": "hasEngaged", "value": false },
    { "field": "channel", "value": null },
    { "field": "unsubscribed", "value": false }
  ]
}
```
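Before handing an event like this to DevOps, it can help to sanity-check it locally. The Python sketch below is a hypothetical pre-flight check, not something the Lambda provides. It covers only the properties this page explicitly marks as required (environment, projectID, s3Bucket, filename), the two filename forms, and the two documented encoding values.

```python
REQUIRED_PROPERTIES = ("environment", "projectID", "s3Bucket", "filename")
ENCODINGS = ("utf8", "utf16le")


def validate_event(event):
    """Return a list of problems found in an event object (empty if none)."""
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED_PROPERTIES
        if not event.get(name)
    ]
    filename = event.get("filename")
    if isinstance(filename, dict):
        for key in ("prefix", "dateFormat"):
            if key not in filename:
                problems.append(f"filename object is missing {key}")
    elif filename is not None and not isinstance(filename, str):
        problems.append("filename must be a string or an object")
    # encoding is optional and defaults to utf8.
    if event.get("encoding", "utf8") not in ENCODINGS:
        problems.append("encoding must be utf8 or utf16le")
    return problems
```

The event can be loaded with json.load and passed to validate_event; an empty list means none of these basic checks failed.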
environments.json
environments.json is a configuration file in the Lambda source that contains the connection information for the various environments.
...
```json
{
  "name": "client-prod",
  "connectionString": "mongodb+srv://username:password@instance.mongodb.net",
  "databaseName": "dbname",
  "host": "orbita-instance-name.orbita.cloud:8443",
  "serviceAccount": {
    "username": "some-user@orbita.cloud",
    "password": "theOrbitaPassword"
  }
}
```
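The event's environment value has to match the name of one of these entries. A small Python sketch of that lookup follows, assuming (this page does not confirm it) that environments.json holds a JSON array of entries shaped like the one above; find_environment is a hypothetical helper.

```python
import json


def find_environment(environments, name):
    """Return the entry whose "name" matches the event's environment value."""
    for entry in environments:
        if entry.get("name") == name:
            return entry
    raise KeyError(f"environment {name!r} is not defined in environments.json")


# Usage sketch (assumes the file is a JSON array in the working directory):
#   with open("environments.json") as fh:
#       env = find_environment(json.load(fh), "client-prod")
```

Failing early with a clear message when the name is unknown makes it obvious that a new environment still needs its connection information added.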
Scheduling Ingestions
To run an ingestion on a schedule, ask DevOps to create a scheduled CloudWatch event (on whatever schedule is needed) that triggers the csv-ingestion Lambda. Provide them with an event object, as outlined in Lambda Event Properties and Example Event, for the CloudWatch event to pass when it invokes the Lambda.
...