supporting gzip checks (#28)
* supporting gzip checks

* update documentation explaining new functionality

* Update README.md

Co-authored-by: Mint Thompson <mathompson@mitre.org>

shaselton-usds and mint-thompson authored Jan 2, 2025
1 parent 5a1d926 commit 0f6d172
Showing 2 changed files with 16 additions and 3 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -54,11 +54,13 @@ Overriding the default error limit to show all errors and warnings:
```sh
cms-hpt-validator ./sample.csv v2.0.0 -e 0
```
### Machine-readable File Extensions
HPT MRFs currently allow two file formats: CSV and JSON. The CLI auto-detects the format of files that end with `.csv` or `.json` and runs the appropriate validator for that file. The CLI can also detect files compressed with gzip: files ending with the `.gz` extension are decompressed before validation. These detections can be combined, so files ending with `.csv.gz` or `.json.gz` are decompressed and validated as CSV or JSON, respectively. For other files ending with `.gz`, use the `-f` option described above.
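
The detection rules in this paragraph can be sketched in TypeScript as follows. This is a minimal illustration, not the project's code: `detectFormat` and its return shape are hypothetical names chosen for the example, and the logic only mirrors the `.gz` stripping and extension lookup visible in this commit's `getFileFormat` change.

```typescript
import * as path from "path"

type FileFormat = "csv" | "json"

// Hypothetical helper (not part of the validator): reports the detected
// format and whether the file would be treated as gzip-compressed.
function detectFormat(filepath: string): {
  format: FileFormat | null
  gzipped: boolean
} {
  // The ".gz" check is case-sensitive, as in the commit.
  const gzipped = filepath.endsWith(".gz")

  // Drop the ".gz" suffix so the inner extension decides the format.
  const inner = gzipped ? filepath.slice(0, -3) : filepath

  const ext = path.extname(inner).toLowerCase().replace(".", "")
  if (ext === "csv" || ext === "json") {
    return { format: ext, gzipped }
  }

  // No recognizable inner extension: the caller would have to supply
  // the format explicitly.
  return { format: null, gzipped }
}
```

Under these assumptions, `sample.csv.gz` resolves to CSV with decompression, while a bare `archive.gz` yields no format, which is the case where the README directs users to the `-f` option.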


## Limitations
The CLI tool may run out of memory when the file being validated contains a large number of errors. If you run into this Node.js error, use the `-e, --error-limit` flag to lower the number of errors that will be collected.


## Contributing

Thank you for considering contributing to an Open Source project of the US
15 changes: 13 additions & 2 deletions src/commands.ts
@@ -1,6 +1,7 @@
import fs from "fs"
import path from "path"
import chalk from "chalk"
import zlib from "zlib"
import {
CsvValidationOptions,
JsonValidatorOptions,
@@ -24,7 +25,13 @@ export async function validate(
return
}

- const inputStream = fs.createReadStream(filepath, "utf-8")
+ const inputStream = filepath.endsWith(".gz")
+ ? fs
+ .createReadStream(filepath)
+ .pipe(zlib.createGunzip())
+ .setEncoding("utf-8")
+ : fs.createReadStream(filepath, "utf-8")

const validationResult = await validateFile(
inputStream,
version,
@@ -64,7 +71,7 @@ export async function validate(
}

async function validateFile(
- inputStream: fs.ReadStream,
+ inputStream: fs.ReadStream | NodeJS.ReadableStream,
version: string,
format: FileFormat,
validatorOptions: CsvValidationOptions | JsonValidatorOptions
@@ -93,6 +100,10 @@ function getFileFormat(
): FileFormat | null {
if (fileFormat.format) return fileFormat.format as FileFormat

if (filepath.endsWith(".gz")) {
filepath = filepath.slice(0, -3)
}

const fileExt = path.extname(filepath).toLowerCase().replace(".", "")
if (["csv", "json"].includes(fileExt)) {
return fileExt as FileFormat
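
The stream change in `validate` chains a gunzip transform and a UTF-8 decoder onto the raw input. The sketch below exercises the same `zlib.createGunzip()` and `setEncoding("utf-8")` chain on an in-memory buffer instead of a file. `gunzipToString` is a hypothetical helper written for this illustration, and `Readable.from` stands in for `fs.createReadStream`; neither appears in the commit.

```typescript
import * as zlib from "zlib"
import { Readable } from "stream"

// Hypothetical helper: decompress a gzip buffer through the same
// pipe(createGunzip()).setEncoding("utf-8") chain used in the commit,
// then collect the decoded chunks into one string.
async function gunzipToString(compressed: Buffer): Promise<string> {
  const stream = Readable.from([compressed]) // stand-in for fs.createReadStream
    .pipe(zlib.createGunzip())
    .setEncoding("utf-8")

  let text = ""
  for await (const chunk of stream) {
    text += chunk // chunks arrive as strings because of setEncoding
  }
  return text
}

// Usage: round-trip a small CSV payload.
const payload = "code,description\n123,example\n"
gunzipToString(zlib.gzipSync(payload)).then((text) => {
  console.log(text === payload)
})
```

Because the piped result is a `zlib.Gunzip` transform rather than an `fs.ReadStream`, the commit also widens the `inputStream` parameter type of `validateFile` to accept `NodeJS.ReadableStream`.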
