Merge pull request #1724 from grafana/document_csv_module
Document experimental csv module
oleiade committed Sep 24, 2024
2 parents 2d53634 + 4be6313 commit 18bd228
Showing 5 changed files with 348 additions and 0 deletions.
57 changes: 57 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/Options.md
@@ -0,0 +1,57 @@
---
title: 'Options'
description: 'Options represents the configuration for CSV parsing.'
weight: 40
---

# Options

The `Options` object describes the configuration available for parsing CSV files with the [`csv.parse`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function and the [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class.

## Properties

| Property | Type | Description |
| :------------ | :---------------- | :-------------------------------------------------------------------------------------------------------- |
| delimiter | string | The character used to separate fields in the CSV file. Default is `','`. |
| skipFirstLine | boolean | Whether to skip the first line of the CSV file. Default is `false`. |
| fromLine | (optional) number | The line number from which to start reading the CSV file. Default is `0`. |
| toLine | (optional) number | The line number at which to stop reading the CSV file. If the option is not set, then read until the end. |
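
The defaults listed in the table can be made explicit with a small helper. The following is a minimal sketch and not part of the k6 API: `withDefaults` and `DEFAULT_OPTIONS` are hypothetical names that only mirror the documented defaults, and `toLine` is left unset so that reading continues to the end of the file.

{{< code >}}

```javascript
// Hypothetical helper (not part of k6): fills in the defaults documented in the table above.
const DEFAULT_OPTIONS = {
  delimiter: ',',
  skipFirstLine: false,
  fromLine: 0,
  // `toLine` has no default: leaving it undefined means "read until the end".
};

function withDefaults(options = {}) {
  // Later spreads win, so caller-supplied values override the defaults.
  return { ...DEFAULT_OPTIONS, ...options };
}

const semicolonOptions = withDefaults({ delimiter: ';', toLine: 8 });
console.log(semicolonOptions);
```

{{< /code >}}

An object built this way could then be passed as the second argument to `csv.parse` or `new csv.Parser`.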

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file, {
    delimiter: ',',
    skipFirstLine: true,
    fromLine: 2,
    toLine: 8,
  });
})();

export default async function () {
  // The `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}
96 changes: 96 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/Parser.md
@@ -0,0 +1,96 @@
---
title: 'Parser'
description: 'A CSV parser for streaming CSV parsing, allowing line-by-line reading with minimal memory consumption.'
weight: 30
---

# Parser

The `csv.Parser` class provides a streaming parser that reads CSV files line-by-line, offering fine-grained control over the parsing process and minimizing memory consumption.
It's well-suited for scenarios where memory efficiency is crucial or when you need to process large CSV files without loading the entire file into memory.

## Asynchronous nature

The `csv.Parser` class methods are asynchronous and return Promises.
Due to k6's current limitation with the [init context](https://grafana.com/docs/k6/<K6_VERSION>/using-k6/test-lifecycle/#the-init-stage) (which doesn't support asynchronous functions directly), you need to use an asynchronous wrapper such as:

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file);
})();
```

{{< /code >}}
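
In plain JavaScript terms, the wrapper starts an asynchronous function and lets it assign the module-level variables once the awaited call resolves. The sketch below illustrates that ordering outside k6. `fakeOpen` is a hypothetical stand-in for `open` from `k6/experimental/fs`, and `ready` and `assignedImmediately` are names introduced only for this illustration.

{{< code >}}

```javascript
// Hypothetical stand-in for `open` from 'k6/experimental/fs'; it resolves asynchronously.
function fakeOpen(path) {
  return Promise.resolve({ path });
}

let file;

// The wrapper returns a promise; it is kept here (as `ready`) only so the sketch can observe it.
const ready = (async function () {
  file = await fakeOpen('data.csv');
})();

// Right after the wrapper is invoked, the assignment has not happened yet.
const assignedImmediately = file !== undefined;

ready.then(() => {
  // By the time the wrapper's promise settles, `file` has been assigned.
  console.log(assignedImmediately, file.path);
});
```

{{< /code >}}

This is why the examples declare `file` and `parser` with `let` at module level and assign them inside the wrapper.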

## Constructor

| Parameter | Type | Description |
| :-------- | :-------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------------------------------------- |
| file | [fs.File](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/file) | A file instance opened using the `fs.open` function. |
| options | [Options](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/options) | An optional parameter object to customize the parsing behavior. Options can include a delimiter (string). |

## Methods

| Name | Description |
| :------- | :---------------------------------------------------------------------------------------------------- |
| `next()` | Reads the next line from the CSV file and returns a promise that resolves to an iterator-like object. |

### Returns

A promise resolving to an object with the following properties:

| Property | Type | Description |
| :------- | :------- | :---------------------------------------------------------------------------------------------------- |
| done | boolean | Indicates whether there are more rows to read (false) or the end of the file has been reached (true). |
| value | string[] | Contains the fields of the CSV record as an array of strings. If done is true, value is undefined. |
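
The `done`/`value` contract lends itself to a simple read loop. The following sketch is illustrative only: `makeStubParser` is a hypothetical stand-in that mimics the documented return shape of `next()` so the loop can run outside k6, and `readRemaining` is a helper name chosen for this example.

{{< code >}}

```javascript
// Hypothetical stand-in for a `csv.Parser`: it yields the given rows one at a time
// using the same `{ done, value }` shape described above.
function makeStubParser(rows) {
  let index = 0;
  return {
    async next() {
      if (index >= rows.length) {
        return { done: true, value: undefined };
      }
      return { done: false, value: rows[index++] };
    },
  };
}

// Reads rows until `done` is true and returns them as an array of string arrays.
async function readRemaining(parser) {
  const rows = [];
  while (true) {
    const { done, value } = await parser.next();
    if (done) {
      break;
    }
    rows.push(value);
  }
  return rows;
}

const stub = makeStubParser([
  ['alice', '30'],
  ['bob', '25'],
]);
readRemaining(stub).then((rows) => console.log(rows.length, rows[0]));
```

{{< /code >}}

Collecting every row gives up the memory benefit of streaming, so in a real test you would more likely process each row as it arrives.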

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

export const options = {
  iterations: 10,
};

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file, { skipFirstLine: true });
})();

export default async function () {
  // The `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}

## Notes on usage

- **Memory efficiency**: Since `csv.Parser` reads the file line-by-line, it keeps memory usage low and avoids loading the entire set of records into memory. This is particularly useful for large CSV files.
- **Streaming control**: The streaming approach provides more control over how records are processed, which can be beneficial for complex data handling requirements.
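
To illustrate the memory point, each row can be folded into a running result and then discarded. The following is a minimal sketch, assuming a numeric value in the second column. `sumColumn` and the inline stub parser are hypothetical and only imitate the `next()` contract so the code can run outside k6.

{{< code >}}

```javascript
// Sums one column while holding only a single row in memory at a time.
async function sumColumn(parser, columnIndex) {
  let total = 0;
  for (;;) {
    const { done, value } = await parser.next();
    if (done) {
      return total;
    }
    // Fields arrive as strings, so convert before adding.
    total += Number(value[columnIndex]);
  }
}

// Hypothetical stub that imitates `csv.Parser.next()` for illustration only.
const rows = [['alice', '30'], ['bob', '25'], ['carol', '41']];
let position = 0;
const stubParser = {
  async next() {
    return position < rows.length
      ? { done: false, value: rows[position++] }
      : { done: true, value: undefined };
  },
};

sumColumn(stubParser, 1).then((total) => console.log(total));
```

{{< /code >}}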
112 changes: 112 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/_index.md
@@ -0,0 +1,112 @@
---
title: 'csv'
description: 'k6 csv experimental API'
weight: 10
---

# csv

{{< docs/shared source="k6" lookup="experimental-module.md" version="<K6_VERSION>" >}}

The `k6/experimental/csv` module provides efficient ways to handle CSV files in k6, offering faster parsing and lower memory
usage compared to traditional JavaScript-based libraries.

The module supports both full-file parsing and streaming, so you can trade parsing speed against
memory usage depending on your test.

## Key features

- The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function parses a complete CSV file into a SharedArray, leveraging Go-based processing for better performance and reduced memory footprint compared to JavaScript alternatives.
- The [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class is a streaming parser that reads CSV files line-by-line, optimizing memory usage and giving more control over the parsing process through a stream-like API.

### Benefits

- **Faster parsing**: The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function bypasses the JavaScript runtime, significantly speeding up parsing for large CSV files.
- **Lower memory usage**: Both [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) and [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) support shared memory across virtual users (VUs) when using the [`fs.open()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/open) function.
- **Flexibility**: Users can choose between full-file parsing with [`csv.parse`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) for speed or line-by-line streaming with [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) for memory efficiency.

### Trade-offs

- The [`csv.parse()`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) function parses the entire file during the initialization phase, which might increase startup time and memory usage for large files. It is best suited to scenarios where performance is more important than memory consumption.
- The [`csv.Parser`](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) class processes the file line-by-line, making it more memory-efficient but potentially slower due to the overhead of reading each line. It is suitable for scenarios where memory usage is critical or where more granular control over parsing is needed.

## API

| Function/Object | Description |
| ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------- |
| [csv.parse()](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parse) | Parses an entire CSV file into a SharedArray for high-performance scenarios. |
| [csv.Parser](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/parser) | A class for streaming CSV parsing, allowing line-by-line reading with minimal memory consumption. |

## Example

### Parsing a full CSV file into a SharedArray

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';
import { scenario } from 'k6/execution';

export const options = {
  iterations: 10,
};

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');

  // The `csv.parse` function consumes the entire file at once and returns
  // the parsed records as a `SharedArray` object.
  csvRecords = await csv.parse(file, { delimiter: ',' });
})();

export default async function () {
  // `csvRecords` is a `SharedArray`. Each element is a record from the CSV file, represented as an array
  // where each element is a field from the CSV record.
  //
  // Thus, `csvRecords[scenario.iterationInTest]` will give us the record for the current iteration.
  console.log(csvRecords[scenario.iterationInTest]);
}
```

{{< /code >}}
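
Indexing with `scenario.iterationInTest` assumes the file has at least as many records as the test has iterations. One way to avoid an `undefined` record is to wrap the index. The sketch below uses a plain array in place of the `SharedArray`, and `recordForIteration` is a hypothetical helper, not part of k6.

{{< code >}}

```javascript
// Hypothetical helper: picks a record for an iteration, wrapping around when the
// iteration number exceeds the number of records.
function recordForIteration(records, iteration) {
  if (records.length === 0) {
    throw new Error('No records available');
  }
  return records[iteration % records.length];
}

// A plain array stands in for the `SharedArray` returned by `csv.parse`.
const records = [
  ['alice', '30'],
  ['bob', '25'],
  ['carol', '41'],
];

console.log(recordForIteration(records, 0));
console.log(recordForIteration(records, 4));
```

{{< /code >}}

In a k6 script, the equivalent call would presumably be `recordForIteration(csvRecords, scenario.iterationInTest)`.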

### Streaming a CSV file line-by-line

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

export const options = {
  iterations: 10,
};

let file;
let parser;
(async function () {
  file = await open('data.csv');
  parser = new csv.Parser(file);
})();

export default async function () {
  // The parser `next` method attempts to read the next row from the CSV file.
  //
  // It returns an iterator-like object with a `done` property that indicates whether
  // there are more rows to read, and a `value` property that contains the row fields
  // as an array.
  const { done, value } = await parser.next();
  if (done) {
    throw new Error('No more rows to read');
  }

  // We expect the `value` property to be an array of strings, where each string is a field
  // from the CSV record.
  console.log(done, value);
}
```

{{< /code >}}
82 changes: 82 additions & 0 deletions docs/sources/next/javascript-api/k6-experimental/csv/parse.md
@@ -0,0 +1,82 @@
---
title: 'parse( file, [options] )'
description: 'Parse a CSV file into a SharedArray'
weight: 20
---

# parse( file, [options] )

The `csv.parse` function parses an entire CSV file at once and returns a promise that resolves to a [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) instance.
This function uses Go-based processing, which results in faster parsing and lower memory usage compared to JavaScript alternatives.
It's ideal for scenarios where performance is a priority, and the entire CSV file can be loaded into memory.

## Asynchronous nature

`csv.parse` is an asynchronous function that returns a Promise. Due to k6's current limitation with the [init context](https://grafana.com/docs/k6/<K6_VERSION>/using-k6/test-lifecycle/) (which
doesn't support asynchronous functions directly), you need to use an asynchronous wrapper like this:

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');
  csvRecords = await csv.parse(file, { delimiter: ',' });
})();
```

{{< /code >}}

## Parameters

| Parameter | Type | Description |
| :-------- | :-------------------------------------------------------------------------------------------- | :-------------------------------------------------------------- |
| file | [fs.File](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs/file) | A file instance opened using the `fs.open` function. |
| options | [Options](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv/options) | An optional parameter object to customize the parsing behavior. |

## Returns

A promise resolving to a [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) instance, where each element is an array representing a CSV record, and each sub-element is a field from that record.
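
Because each record is an array of strings, fields are addressed by position. When the first line of the file holds column names, a small helper can pair those names with the fields of a record. The following is a minimal sketch with a plain array standing in for the `SharedArray`. `toObject` is a hypothetical helper, and the example assumes the header line has not been skipped with `skipFirstLine`.

{{< code >}}

```javascript
// Hypothetical helper: pairs column names with the fields of one record.
function toObject(header, record) {
  return Object.fromEntries(header.map((name, i) => [name, record[i]]));
}

// A plain array stands in for the `SharedArray` returned by `csv.parse`.
const records = [
  ['name', 'age'],
  ['alice', '30'],
  ['bob', '25'],
];

const header = records[0];
console.log(toObject(header, records[1]));
```

{{< /code >}}

Converting one record at a time, rather than the whole array, keeps the bulk of the data in the shared structure.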

## Example

{{< code >}}

```javascript
import { open } from 'k6/experimental/fs';
import csv from 'k6/experimental/csv';
import { scenario } from 'k6/execution';

export const options = {
  iterations: 10,
};

let file;
let csvRecords;
(async function () {
  file = await open('data.csv');

  // The `csv.parse` function consumes the entire file at once and returns
  // the parsed records as a `SharedArray` object.
  csvRecords = await csv.parse(file, { skipFirstLine: true });
})();

export default async function () {
  // `csvRecords` is a `SharedArray`. Each element is a record from the CSV file, represented as an array
  // where each element is a field from the CSV record.
  //
  // `csvRecords[scenario.iterationInTest]` gives the record for the current iteration.
  console.log(csvRecords[scenario.iterationInTest]);
}
```

{{< /code >}}

## Notes on usage

- **Memory considerations**: `csv.parse` loads the entire CSV file into memory at once, which may lead to increased memory usage and startup time for very large files.
- **Shared memory usage**: The [SharedArray](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-data/sharedarray) returned by `csv.parse` is shared among all Virtual Users (VUs), reducing memory overhead when multiple VUs access the same data.
1 change: 1 addition & 0 deletions docs/sources/next/shared/javascript-api/k6-experimental.md
@@ -4,6 +4,7 @@ title: javascript-api/k6-experimental

| Modules | Description |
| ------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------- |
| [csv](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/csv)               | Provides support for efficient and convenient parsing of CSV files.                                                        |
| [fs](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/fs) | Provides a memory-efficient way to handle file interactions within your test scripts. |
| [redis](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/redis) | Functionality to interact with [Redis](https://redis.io/). |
| [streams](https://grafana.com/docs/k6/<K6_VERSION>/javascript-api/k6-experimental/streams) | Provides an implementation of the Streams API specification, offering support for defining and consuming readable streams. |