Overview
csv_extraction is a generalized Lambda for extracting content from a particular Orbita schema into a CSV file written to an S3 bucket.
Lambda Event Properties
environment (string, required)
The name of the environment as specified in environments.json. New environments will need their connection information added to environments.json.
projectID (string, required)
The ID of the project in Orbita on the target environment that the content will be extracted from.
s3Bucket (string, required)
The path to the S3 bucket that the file will be written to.
filename (string|object, required)
There are two potential formats for the filename parameter:
...
Code Block |
---|
"filename": { "prefix": "outbound/activity-", "dateFormat": "YYYYMMDD" } |
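To illustrate how the object form might resolve to a file name, here is a minimal Python sketch. It is not the Lambda's actual code: the helper name `resolve_filename`, the token table, the pass-through handling of the string form, and the `.csv` extension are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Assumed translation of the date tokens seen in the example to strftime codes.
_TOKENS = [("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d")]


def resolve_filename(filename, now=None, extension=".csv"):
    """Hypothetical helper: build a file name from the `filename` event property."""
    if isinstance(filename, str):
        # Assumption: the string form is used as given.
        return filename
    now = now or datetime.now(timezone.utc)
    fmt = filename["dateFormat"]
    for token, code in _TOKENS:
        fmt = fmt.replace(token, code)
    # Assumption: the formatted date is appended to the prefix, then the extension.
    return filename["prefix"] + now.strftime(fmt) + extension


print(resolve_filename(
    {"prefix": "outbound/activity-", "dateFormat": "YYYYMMDD"},
    now=datetime(2024, 3, 5, tzinfo=timezone.utc),
))
```

Under these assumptions, a run on 5 March 2024 would produce `outbound/activity-20240305.csv`.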
schema (object, required)
An object indicating the target Orbita content schema to extract, with the following properties:
key (string, required) - The key of the schema, which can be found in the schema details in Orbita.
type (string, required) - The type of the schema, either “content” or “dynamic”
dateRange (object, optional)
An object indicating a date range to limit extracted records to:
...
Code Block |
---|
"dateRange": { "days": 1, "field": "createdAt" } |
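As a rough illustration, a dateRange like the one above could be turned into a Mongo query along these lines. This is a sketch only: the helper name `date_range_query` is hypothetical, and reading `days` as "the last N days up to now" is an assumption, not something this page confirms.

```python
from datetime import datetime, timedelta, timezone


def date_range_query(date_range, now=None):
    """Hypothetical helper: build a Mongo-style query from a dateRange object."""
    now = now or datetime.now(timezone.utc)
    # Assumption: "days" counts back from the moment the Lambda runs.
    start = now - timedelta(days=date_range["days"])
    return {date_range["field"]: {"$gte": start}}


query = date_range_query({"days": 1, "field": "createdAt"})
print(query)
```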
filter (object, optional)
An object that will be applied directly as a Mongo query to further filter the included records.
Info |
---|
If dateRange and filter are both specified, they will be merged together into one Mongo query that is applied when pulling records (with the dateRange query fields taking precedence over any matching fields in the filter query). |
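The precedence rule described above can be sketched in Python as a plain dictionary merge. The function name `merge_queries` is hypothetical, and the Lambda's real implementation may differ. The point is only that a field present in both queries ends up with the dateRange value.

```python
def merge_queries(filter_query=None, date_query=None):
    """Hypothetical helper: merge two Mongo-style queries, dateRange fields winning."""
    merged = dict(filter_query or {})
    # Fields from the dateRange query overwrite matching fields from the filter.
    merged.update(date_query or {})
    return merged


merged = merge_queries(
    {"isComplete": True, "modifiedAt": {"$lt": "2024-01-01"}},
    {"modifiedAt": {"$gte": "2024-03-04"}},
)
print(merged)  # {'isComplete': True, 'modifiedAt': {'$gte': '2024-03-04'}}
```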
mapping (array of field mappings, optional)
An array of mappings of fields from the Orbita schema to headers in the written CSV file, using the following properties:
...
Note |
---|
If a mapping is specified, only the fields in the mapping will be included in the extract. |
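As an illustration of the effect of a mapping, the following Python sketch writes only the mapped fields, under their mapped headers. It uses the `from` and `to` properties shown in the example event on this page. The helper name `records_to_csv` and the choice to emit an empty cell for a missing field are assumptions.

```python
import csv
import io


def records_to_csv(records, mapping):
    """Hypothetical helper: render records as CSV using only the mapped fields."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row uses the "to" names, in mapping order.
    writer.writerow([m["to"] for m in mapping])
    for record in records:
        # Assumption: a field missing from a record becomes an empty cell.
        writer.writerow([record.get(m["from"], "") for m in mapping])
    return buffer.getvalue()


mapping = [{"from": "q1", "to": "Question 1"}, {"from": "q2", "to": "Question 2"}]
records = [{"q1": "yes", "q2": "no", "internalNote": "not exported"}]
print(records_to_csv(records, mapping))
```

The `internalNote` field is left out of the output because it does not appear in the mapping.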
Example Event
Here is a complete example of the JSON for an event that could be used to run the csv_extraction Lambda:
Code Block |
---|
{ "environment": "clientName-prod", "projectID": "6075d174b497b60079cXXXXX", "s3Bucket": "private-orbitahealth/clients/clientName/oe/.oe/customdata/6075d174b497b60079cXXXXX/sftp", "filename": { "prefix": "outbound/survey-results-", "dateFormat": "YYYYMMDD" }, "schema": { "key": "survey", "type": "dynamic" }, "dateRange": { "days": 1, "field": "modifiedAt" }, "filter": { "isComplete": true }, "mapping": [{ "from": "q1", "to": "Question 1" },{ "from": "q2", "to": "Question 2" },{ "from": "q3", "to": "Question 3" }] } |
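Before handing an event to the Lambda, it can help to check that the required properties listed under Lambda Event Properties are present. The sketch below is a standalone check, not part of the Lambda, and the function name `missing_properties` is hypothetical.

```python
# Required top-level properties, as listed under Lambda Event Properties.
REQUIRED = ("environment", "projectID", "s3Bucket", "filename", "schema")
# Required properties of the schema object.
REQUIRED_SCHEMA = ("key", "type")


def missing_properties(event):
    """Hypothetical helper: list required event properties that are absent."""
    missing = [name for name in REQUIRED if name not in event]
    schema = event.get("schema")
    if isinstance(schema, dict):
        missing += [f"schema.{name}" for name in REQUIRED_SCHEMA if name not in schema]
    return missing


event = {"environment": "clientName-prod", "schema": {"key": "survey"}}
print(missing_properties(event))
# ['projectID', 's3Bucket', 'filename', 'schema.type']
```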
environments.json
environments.json is a configuration file in the Lambda source that contains the connection information for each supported environment.
...
Code Block |
---|
{ "name": "client-prod", "connectionString": "mongodb+srv://username:password@instance.mongodb.net", "databaseName": "dbname" } |
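The lookup of an environment by the event's `environment` value might look roughly like the Python sketch below. It assumes environments.json is a JSON array of entries shaped like the one above. The file's overall layout is not shown on this page, so treat that, and the helper name `find_environment`, as assumptions.

```python
import json


def find_environment(environments_json, name):
    """Hypothetical helper: return the entry whose "name" matches the event's environment."""
    # Assumption: the file is a JSON array of environment entries.
    for entry in json.loads(environments_json):
        if entry.get("name") == name:
            return entry
    raise KeyError(f"environment {name!r} not found in environments.json")


sample = json.dumps([
    {
        "name": "client-prod",
        "connectionString": "mongodb+srv://username:password@instance.mongodb.net",
        "databaseName": "dbname",
    }
])
print(find_environment(sample, "client-prod")["databaseName"])  # dbname
```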
Scheduling Extractions
To run an extraction on a schedule, ask DevOps to create a scheduled CloudWatch event, on whatever schedule is necessary, that triggers the csv_extraction Lambda. Provide them with the event object that the CloudWatch event should pass when invoking the Lambda, built as outlined in Lambda Event Properties and Example Event.