Upload CSV Bulk Import File
The upload-csv-bulk-import-file
command may be used to upload a CSV file to an ETL Importer module by submitting it to the ETL Import Endpoint.
This command performs CSV-to-FHIR mapping in the Smile CDR server process. See also Map and Upload CSV Bulk Import File for an alternate command that performs the mapping in the SmileUtil client process.
bin/smileutil upload-csv-bulk-import-file -b "username:password" -f "/path/to/sourcefile.csv" -u "http://localhost:9000" -i "etl_importer"
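For example, the optional arguments described below may be combined in a single invocation (illustrative values only; adjust the paths, URL, and module ID to match your deployment):

bin/smileutil upload-csv-bulk-import-file -b "PROMPT" -f "/path/to/source-directory" -u "http://localhost:9000" -i "etl_importer" -s 1000 -r 2 -m "/path/to/processed-directory"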
-f [filename or directory] (or --filename) – This argument should point to an individual file or to a directory (in which case all files in the directory will be processed). Note that any files with an extension ending in .gz or .bz2 will be expanded during processing.

-i [module id] (or --module-id) – This argument supplies the ID of the ETL Importer module on the same node as the JSON Admin API module.

-u [url] (or --url) – The base URL for the JSON Admin API server.

-b [username:password] (or --basic-auth [username:password]) – (optional) If specified, provides a username and password that will be supplied to the server in an HTTP Basic Authorization header in the form "username:password". If the value supplied is "PROMPT", smileutil will prompt the user to enter credentials interactively.

-m [path] (or --move-after) – (optional) If supplied, source files will be moved to the given directory after they have been uploaded.

-s [number] (or --split-rows) – (optional) If supplied, this is the number of rows to send in each batch. See Sending Batches for a Single File below.

-q [number] (or --quit-after) – (optional) If supplied, the command will exit after processing this number of files.

-r [number] (or --retry-count) – (optional) If supplied, the command will automatically retry after any failures, up to the specified number of times. This is useful in cases where network problems might interrupt an upload. For example, if this parameter is given a value of 2, up to three attempts will be made to deliver an individual file before aborting.

-k [count] (or --skip-rows) – (optional) If specified, the command will skip the first N rows instead of delivering them. Note that the very first row in a given file is assumed to be a header and is not skipped or counted. Also note that if multiple files are being transmitted using a directory as the source value, the first N rows across all files are skipped (not the first N rows in each file).

-j [string] (or --user-job-type) – (optional) If specified, the string supplied by this argument is passed through Smile CDR unchanged and made available to the mapping script to assist in processing multiple data structures.

--request-id – (optional) This argument populates the value of the X-Request-Id header, and can be used to match server logs against a specific smileutil invocation.

Sending Batches for a Single File
The -s (or --split-rows) argument may be used to control how many rows are sent to the server in each request.
Each row is processed as its own transaction, and this is not affected by the -s parameter. The -s argument only controls how many rows are sent to the server at a time. The reason to send batches of rows (as opposed to sending the whole file at once) is so that the server can report progress back to the user. For example, suppose a file has 1,000,000 rows. If they are all sent at once, smileutil receives no response from the server until all 1,000,000 rows have been processed, so there is little visible sign of progress. If the upload is instead broken into increments of 1,000, the user sees an update every 1,000 rows. Other than this difference, the -s argument has no effect on performance or behavior.
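The batching behavior described above can be illustrated with a short Python sketch (this is not smileutil's actual implementation; the function name and sample data are invented for illustration). Each batch repeats the header row, so every request carries a complete CSV document, and the caller can report progress after each batch:

```python
import csv
import io

def split_csv_into_batches(csv_text, rows_per_batch):
    """Split CSV text into batches of at most rows_per_batch data rows.

    The first row is treated as a header (as smileutil assumes) and is
    repeated at the top of every batch, so each batch is a complete
    CSV document.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    batches = []
    for start in range(0, len(data), rows_per_batch):
        chunk = data[start:start + rows_per_batch]
        out = io.StringIO()
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(chunk)
        batches.append(out.getvalue())
    return batches

# Five data rows with a batch size of 2 produce three requests,
# so progress can be reported three times instead of once.
source = "ID,NAME\n1,A\n2,B\n3,C\n4,D\n5,E\n"
batches = split_csv_into_batches(source, 2)
for i, batch in enumerate(batches, start=1):
    print(f"batch {i}/{len(batches)} uploaded")
```

With a batch size equal to the total row count (or no batching at all), the same data would go in a single request and the only feedback would arrive after the entire file was processed.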