CSV Extraction
Overview
csv_extraction is a generalized Lambda for extracting content from a particular Orbita schema into a CSV file written to an S3 bucket.
Lambda Event Properties
environment (string, required)
The name of the environment as specified in environments.json. New environments will need their connection information added to environments.json.
projectID (string, required)
The ID of the project in Orbita on the target environment that the content will be extracted from.
s3Bucket (string, required)
The path to the S3 bucket that the file will be written to.
filename (string|object, required)
There are two potential formats for the filename parameter:
A string containing the exact filename or file path to write to in the specified S3 bucket, which must have an extension of .csv (ex. “test.csv” or “inbound/test.csv”)
An object with the following two properties, which produces a CSV filename or path stamped with today’s date (based on Eastern Time):
prefix (string, required) - The portion before the date stamp (ex. “patients-” or “inbound/patients-”)
dateFormat (string, required) - The moment.js date format to use to generate the date stamp (ex. “YYYYMMDD”)
As an example of the latter format, the following filename parameter, if run on October 27th, 2021 (in Eastern Time), would resolve to a target file path of “outbound/activity-20211027.csv”:
"filename": {
"prefix": "outbound/activity-",
"dateFormat": "YYYYMMDD"
}
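The two filename formats can be sketched roughly as follows. This is only an illustration, not the Lambda's actual code: resolveFilename is a hypothetical helper, it uses the built-in Intl API rather than moment.js, and it understands only the YYYY, MM, and DD tokens of a date format.

```javascript
// Hypothetical helper (not from the Lambda source): resolves the `filename`
// event property to the target path in the S3 bucket.
// Assumption: only the YYYY, MM, and DD tokens are supported here, whereas
// the real Lambda accepts any moment.js format string.
function resolveFilename(filename, now = new Date()) {
  if (typeof filename === "string") {
    // Exact filename or path: it must already end in .csv.
    if (!filename.endsWith(".csv")) {
      throw new Error("filename must have an extension of .csv");
    }
    return filename;
  }

  // Read the calendar date as seen in Eastern Time.
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "America/New_York",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(now);
  const get = (type) => parts.find((part) => part.type === type).value;

  const stamp = filename.dateFormat
    .replace("YYYY", get("year"))
    .replace("MM", get("month"))
    .replace("DD", get("day"));

  // Prefix, then date stamp, then the .csv extension.
  return `${filename.prefix}${stamp}.csv`;
}

const resolved = resolveFilename(
  { prefix: "outbound/activity-", dateFormat: "YYYYMMDD" },
  new Date("2021-10-27T16:00:00Z") // noon Eastern Time on October 27th, 2021
);
console.log(resolved); // outbound/activity-20211027.csv
```

Passing the current time as a parameter keeps the sketch easy to check against a fixed date.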
schema (object, required)
An object indicating the target Orbita content schema to extract, with the following properties:
key (string, required) - The key of the schema, which can be found by looking at the schema details in Orbita.
type (string, required) - The type of the schema, either “content” or “dynamic”
dateRange (object, optional)
An object indicating a date range to limit extracted records to:
days (number, required) - How many days of records to include.
field (string, required) - The Orbita field to filter based on. Valid values are “createdAt” and “modifiedAt”.
offset (number, optional) - The number of days to shift the end of the window back from the time the Lambda is run. If omitted, the window ends at the time of the run.
For example, the following would extract records with a createdAt timestamp that falls in a window of 48 to 24 hours before the Lambda is run:
"dateRange": {
"days": 1,
"field": "createdAt",
"offset": 1
}
Without the offset, it would cover records with a createdAt timestamp in the 24 hours before the Lambda is run:
"dateRange": {
"days": 1,
"field": "createdAt"
}
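The window arithmetic behind these two examples can be sketched as below. buildDateRangeQuery is a hypothetical helper, and two details are assumptions rather than documented behavior: a "day" is treated as exactly 24 hours (consistent with the 48 to 24 hour example), and the window includes its start but excludes its end.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper (not from the Lambda source): converts a dateRange
// object into a Mongo-style query fragment on the chosen field.
function buildDateRangeQuery(dateRange, now = new Date()) {
  const { days, field, offset = 0 } = dateRange;
  if (!["createdAt", "modifiedAt"].includes(field)) {
    throw new Error(`Unsupported dateRange field: ${field}`);
  }

  // The window ends `offset` days before the run and spans `days` days.
  const end = new Date(now.getTime() - offset * MS_PER_DAY);
  const start = new Date(end.getTime() - days * MS_PER_DAY);

  // Assumption: inclusive start, exclusive end.
  return { [field]: { $gte: start, $lt: end } };
}

const runTime = new Date("2021-10-27T12:00:00Z");
const offsetQuery = buildDateRangeQuery(
  { days: 1, field: "createdAt", offset: 1 },
  runTime
);
console.log(offsetQuery.createdAt.$gte.toISOString()); // 2021-10-25T12:00:00.000Z
console.log(offsetQuery.createdAt.$lt.toISOString()); // 2021-10-26T12:00:00.000Z
```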
filter (object, optional)
An object that will be applied directly as a Mongo query to further filter the included records.
If dateRange and filter are both specified, they will be merged together into one Mongo query that is applied when pulling records (with the dateRange query fields taking precedence over any matching fields in the filter query).
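One way to picture the precedence rule is a shallow merge in which the dateRange fields are spread last. This is an assumption about the merge strategy, not the Lambda's confirmed implementation. mergeQueries and the field names in the sample filter are hypothetical.

```javascript
// Hypothetical helper (not from the Lambda source): combines the optional
// filter with the query derived from dateRange.
// Assumption: the merge is shallow, so a matching top-level field in the
// filter is replaced entirely by the dateRange version.
function mergeQueries(filter = {}, dateRangeQuery = {}) {
  return { ...filter, ...dateRangeQuery };
}

const sampleFilter = {
  status: "active", // illustrative field, kept as-is
  createdAt: { $gte: new Date("2020-01-01T00:00:00Z") }, // overridden below
};
const sampleDateRangeQuery = {
  createdAt: {
    $gte: new Date("2021-10-26T12:00:00Z"),
    $lt: new Date("2021-10-27T12:00:00Z"),
  },
};

const mergedQuery = mergeQueries(sampleFilter, sampleDateRangeQuery);
console.log(Object.keys(mergedQuery)); // [ 'status', 'createdAt' ]
```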
mapping (array of field mappings, optional)
An array of mappings of fields from the Orbita schema to headers in the written CSV file, using the following properties:
from (string, required) - The field in the Orbita schema.
to (string, required) - The header to map to in the CSV file.
If a mapping is specified, only the fields in the mapping will be included in the extract.
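As an illustration of how a mapping shapes the output, the sketch below applies a hypothetical mapping to a couple of made-up records. The field names, the toMappedCsv helper, the quoting rules, and the treatment of missing values as empty cells are all assumptions, not taken from the Lambda source.

```javascript
// Hypothetical mapping: field names and headers are illustrative only.
const exampleMapping = [
  { from: "firstName", to: "First Name" },
  { from: "createdAt", to: "Created" },
];

// Hypothetical helper: builds CSV text that contains only the mapped fields,
// in mapping order. Assumptions: `from` names a top-level field, and missing
// values are written as empty cells.
function toMappedCsv(records, mapping) {
  const escapeCell = (value) => {
    const text = String(value ?? "");
    // Quote cells containing commas, quotes, or line breaks.
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = mapping.map((m) => escapeCell(m.to)).join(",");
  const rows = records.map((record) =>
    mapping.map((m) => escapeCell(record[m.from])).join(",")
  );
  return [header, ...rows].join("\n");
}

const mappedCsv = toMappedCsv(
  [
    { firstName: "Ann", lastName: "Lee", createdAt: "2021-10-27" },
    { firstName: "Lee, Bo", lastName: "Kim" },
  ],
  exampleMapping
);
console.log(mappedCsv);
```

The lastName field never appears in the output because it is not part of the mapping.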
Example Event
Here is a complete example of the JSON for an event that could be used to run the csv_extraction Lambda:
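An illustrative event assembled from the properties documented above might look like the following. Every value is a hypothetical placeholder (environment name, project ID, bucket, schema key, filter, and mapping fields), and findMissingRequired is a hypothetical helper that only checks for the presence of the properties marked required.

```javascript
// All values are hypothetical placeholders, not real configuration.
const exampleEvent = {
  environment: "dev",
  projectID: "<project-id>",
  s3Bucket: "example-extract-bucket",
  filename: { prefix: "outbound/activity-", dateFormat: "YYYYMMDD" },
  schema: { key: "activity", type: "content" },
  dateRange: { days: 1, field: "createdAt", offset: 1 },
  filter: { status: "active" },
  mapping: [
    { from: "firstName", to: "First Name" },
    { from: "createdAt", to: "Created" },
  ],
};

// Hypothetical helper: lists required top-level properties that are absent.
function findMissingRequired(event) {
  const required = ["environment", "projectID", "s3Bucket", "filename", "schema"];
  return required.filter(
    (name) => event[name] === undefined || event[name] === null
  );
}

// Print the event as the JSON that would be passed to the Lambda.
console.log(JSON.stringify(exampleEvent, null, 2));
console.log(findMissingRequired(exampleEvent)); // []
```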
environments.json
environments.json is a configuration file in the Lambda source that contains the connection information for various environments.
An example of an entry:
Scheduling Extractions
To run an extraction on a schedule, ask DevOps to create a new scheduled CloudWatch event (on whatever schedule is necessary) that triggers the csv_extraction Lambda. Provide them with an event object, as outlined in Lambda Event Properties and Example Event, for the CloudWatch event to pass when invoking the Lambda.