50.16.1 smileutil: Upload CSV Bulk Import File

The upload-csv-bulk-import-file command may be used to upload a CSV file to an ETL Importer module by submitting it to the ETL Import Endpoint.

This command performs CSV-to-FHIR mapping in the Smile CDR server process. See also Map and Upload CSV Bulk Import File for an alternate command that performs the mapping in the SmileUtil client process.

50.16.2 Usage

bin/smileutil upload-csv-bulk-import-file -b "username:password" -f "/path/to/sourcefile.csv" -u "http://localhost:9000" -i "etl_importer"
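As a variation on the command above, the following sketch uploads every file in a directory, prompts for credentials, sends rows in batches, retries on failure, and moves completed files aside. It uses only options described under Options below. The directory paths, URL, and module ID are placeholders (assumptions), and the script prints the command rather than running it so the arguments can be reviewed first.

```shell
#!/bin/sh
# Sketch only: every path, the URL, and the module ID below are placeholders.
SOURCE_DIR="/path/to/csv-inbox"     # hypothetical directory of CSV files
DONE_DIR="/path/to/csv-done"        # hypothetical archive directory

# Collect the arguments as positional parameters so each value stays
# a single argument even if it contains spaces.
set -- upload-csv-bulk-import-file \
  -b "PROMPT" \
  -f "$SOURCE_DIR" \
  -u "http://localhost:9000" \
  -i "etl_importer" \
  -s 1000 \
  -r 2 \
  -m "$DONE_DIR"

# Dry run: print the command. Remove "echo" to actually invoke smileutil.
echo bin/smileutil "$@"
```

Note that the printed line does not show quoting, so it is meant for inspection rather than copy and paste.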

50.16.3 Options

-f [filename or directory] (or --filename) – This argument should point to an individual file or to a directory (in which case all files in the directory will be processed). Note that any files with an extension ending in .gz or .bz2 will be expanded during processing.
-i [module id] (or --module-id) – This argument should supply the ID of the ETL module on the same node as the JSON Admin API module.
-u [url] (or --url) – The base URL for the JSON Admin API server.
-b [username:password] (or --basic-auth [username:password]) – (optional) If specified, provides a username and password that will be supplied to the server in an HTTP Basic Authorization header in the form of "username:password". If the value supplied is "PROMPT", smileutil will prompt the user to enter credentials interactively.
-m [path] (or --move-after) – (optional) If supplied, source files will be moved to the given directory after they have been uploaded.
-s [number] (or --split-rows) – (optional) If supplied, this is the number of rows to send in each batch. See Sending Batches for a Single File below.
-q [number] (or --quit-after) – (optional) If supplied, the command will exit after processing this number of files.
-r [number] (or --retry-count) – (optional) If supplied, the command will automatically retry after any failures, up to the specified number of times. This is useful in cases where network problems might cause an upload to fail. For example, if a value of 2 is supplied, up to three attempts (the initial attempt plus two retries) will be made to deliver an individual file before aborting.
-k [count] (or --skip-rows) – (optional) If specified, the command will skip the first N rows instead of delivering them. Note that the very first row in a given file is assumed to be a header and is not skipped or counted. Also note that if multiple files are being transmitted using a directory as the source value, the first N rows across all files are skipped (not the first N rows in each file).
-j [string] (or --user-job-type) – (optional) If specified, the string supplied by this argument is transparently passed through Smile CDR, and is made available to the mapping script to assist in processing multiple data structures.
--request-id – (optional) This argument populates the value for the X-Request-Id header, and can be used for matching server logs against a specific smileutil invocation.
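
The row counting described for --skip-rows can be easy to misread when a directory is supplied, so here is a small illustration. It does not call smileutil and is not how smileutil is implemented; it only mimics the documented counting rule with awk on two throwaway sample files (the file names and contents are made up for the example).

```shell
#!/bin/sh
# Illustration only: mimics the documented --skip-rows counting with awk.
tmp=$(mktemp -d)
printf 'ID,NAME\n1,a\n2,b\n3,c\n'      > "$tmp/first.csv"    # 3 data rows
printf 'ID,NAME\n4,d\n5,e\n6,f\n7,g\n' > "$tmp/second.csv"   # 4 data rows

SKIP=5   # stands in for the value passed to -k

delivered=$(awk -v skip="$SKIP" '
  FNR == 1 { next }             # header of each file: does not count toward N
  (++seen) > skip { print }     # one counter shared across all files
' "$tmp/first.csv" "$tmp/second.csv")

echo "$delivered"
rm -rf "$tmp"
```

With a skip value of 5, all three data rows of the first file and the first two data rows of the second file are skipped, so only the last two rows of the second file remain.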

50.16.4 Sending Batches for a Single File

The -s or --split-rows argument may be used to send the CSV file in smaller batches for better progress updates.

Each row is processed as its own transaction, and this is not affected by the -s argument. The -s argument only controls how many rows are sent to the server at a time. The sole reason to send rows in batches (as opposed to sending the whole file at once) is so that the server can report progress back to the user. For example, suppose the file has 1,000,000 rows. If they are all sent at once, smileutil receives no response from the server until all 1,000,000 rows have been processed, so there is little visible sign of progress. If the file is instead sent in batches of 1,000 rows, the user sees an update after every 1,000 rows. Other than this difference, the -s argument has no effect on performance or behavior.
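
To get a feel for how many progress updates a given batch size will produce, the number of batches can be estimated before uploading. The sketch below is a rough estimate under two assumptions: the file has exactly one header row, and every CSV record occupies a single line (no quoted line breaks). The sample file is generated only so the example is self-contained.

```shell
#!/bin/sh
# Sketch: estimate how many batches a given -s value will produce.
tmp=$(mktemp)
# Generate a sample file: one header row plus 2500 data rows.
awk 'BEGIN { print "ID,NAME"; for (i = 1; i <= 2500; i++) print i ",x" }' > "$tmp"

SPLIT_ROWS=1000                      # stands in for the value passed to -s
total_lines=$(wc -l < "$tmp")
data_rows=$((total_lines - 1))       # exclude the header row
# Ceiling division: a final partial batch still counts as a batch.
batches=$(( (data_rows + SPLIT_ROWS - 1) / SPLIT_ROWS ))

echo "$data_rows rows in batches of $SPLIT_ROWS -> $batches batches"
rm -f "$tmp"
```

Here 2,500 data rows with a batch size of 1,000 give three batches, the last one holding the remaining 500 rows.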

50.16.5 CSV-to-FHIR Mapping Configuration

The CSV-to-FHIR mapping is configured in the ETL Importer module on the server side. This configuration includes:

1. A JavaScript mapper script with a handleEtlImportRow(inputMap, context) function that transforms CSV rows into FHIR resources
2. Optional configuration for CSV delimiters, headers, and other formatting options
3. Options for batching, transactions, and error handling

For detailed information on configuring the server-side mapping, see the ETL Import Module documentation.
