Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CSV file reading functionality to the workspace file system #740

Merged
merged 3 commits into from
Sep 30, 2024

Conversation

pelikhan
Copy link
Member

@pelikhan pelikhan commented Sep 30, 2024

This pull request adds the ability to read CSV files to the workspace file system. It includes the implementation of the readCSV method in the WorkspaceFileSystem interface, as well as the necessary changes in the codebase to support reading CSV files. The readCSV method can be used to read the content of a CSV file and optionally specify the delimiter and headers. This functionality enhances the capabilities of the workspace file system and provides developers with a convenient way to read and parse CSV files in their projects.


  • The new readCSV method is introduced 🧑‍💻. It's used to read the content of a file as CSV.
  • The readCSV method is added to WorkspaceFileSystem in prompt_template.d.ts, making it a part of the public API and ready to use by the user. User-facing change! 🌐
  • readCSV is also added and made accessible in promptcontext.ts and workspace.genai.mjs.
  • The CSVParseOptions interface is introduced for user-friendly and structured representation of CSV parsing options. User-facing change! 🛠️
  • A correction of a typo in the usage of the retrieval command in commands.md.
  • A blurb about readCSV, complete with usage examples, is added to the docs in files.md, enhancing the documentation 📖.
  • Code in workspace.genai.mjs is updated to test the new readCSV function.
  • Some refactoring is done within filesystem.ts to facilitate the CSV parsing functionality 👷‍♂️.
  • No critical changes were applied to imports as per instructions ✋.

generated by pr-describe

@@ -235,9 +235,9 @@ Options:

## `retrieval`

```
Usage: genaiscript retrieval|retreival [options] [command]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a typo here.

generated by pr-docs-review-commit typo


In Typescript, you can type the output.

```ts '<{ name: string; value: number }>'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect code fence, Typescript type annotation should not be in a string format.

generated by pr-docs-review-commit incorrect_code_fence

Copy link

The GIT_DIFF presents changes in three different files namely filesystem.ts, promptcontext.ts and prompt_template.d.ts.

The changes in the filesystem.ts file add a readCSV method to the createFileSystem function. This method reads text from a CSV file and parses it into a usable data structure.

In the promptcontext.ts file, the createPromptContext function has been updated to include the readCSV method.

In the prompt_template.d.ts file, a readCSV method has been added to the WorkspaceFileSystem interface and a new interface CSVParseOptions has been added as well, which seems to be used for specifying options for CSV parsing.

There are no functional issues detected in the changes. All changes seem to logically follow from introducing the readCSV method.

In conclusion, the changes seem to be well-integrated, preserving the existing structure and extending the functionality in a logical manner.

LGTM 🚀

generated by pr-review

Copy link

Investigator report

, b2da42c

Root Cause Analysis

The failure occurs due to a TypeScript error in the file ../core/src/filesystem.ts at line 72 related to the readCSV function. The issue arises because the return type Promise<object[]> does not match the expected return type Promise<T[]> where T extends object.

Here's the problematic code snippet:

readCSV: async (f: string | Awaitable<WorkspaceFile>, options) => {

The expected type comes from the property readCSV which is declared in ../core/src/types/prompt_template.d.ts with the signature:

readCSV<T extends object>(
    path: string | Awaitable<WorkspaceFile>,
    options?: CSVParseOptions
): Promise<T[]>

The error indicates that object[] cannot be assigned to T[] because object is a broader type and T could be a more specific subtype.

Suggested Fix

Here's a suggested diff to resolve the type mismatch:

--- a/core/src/filesystem.ts
+++ b/core/src/filesystem.ts
@@ -71,7 +71,7 @@
     {
         readCSV: async <T extends object>(f: string | Awaitable<WorkspaceFile>, options): Promise<T[]> => {
             const csvData = await someCSVParsingFunction(f, options);
-            return csvData as object[];
+            return csvData as T[];
         },
     }

Explanation

  • The proposed fix casts the result of the CSV parsing operation to T[] instead of object[]. This ensures that the function signature matches the expected return type specified in the type declaration.

generated by gai

@@ -235,9 +235,9 @@ Options:

## `retrieval`

```
Usage: genaiscript retrieval|retreival [options] [command]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code block starting with 'Usage: genaiscript retrieval|retreival [options] [command]' should be fenced with triple backticks to maintain consistency with the document formatting.

generated by pr-docs-review-commit missing_code_fence

): Promise<T[]> => {
const file = await fs.readText(f)
const res = CSVParse(file.content, options) as T[]
return res

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no error handling for the async function readCSV. If the file read operation or CSV parsing fails, it could lead to unhandled promise rejection. Consider adding a try-catch block. 😊

generated by pr-review-commit missing_error_handling

@@ -64,6 +64,7 @@ export async function createPromptContext(
readText: (f) => runtimeHost.workspace.readText(f),
readJSON: (f) => runtimeHost.workspace.readJSON(f),
readXML: (f) => runtimeHost.workspace.readXML(f),
readCSV: (f) => runtimeHost.workspace.readCSV(f),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function readCSV is missing type annotations for its argument and return value. This could lead to confusion about the expected input and output. Please add type annotations. 😊

generated by pr-review-commit missing_type

readCSV<T extends object>(
path: string | Awaitable<WorkspaceFile>,
options?: CSVParseOptions
): Promise<T[]>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The readCSV function in the WorkspaceFileSystem interface is missing a detailed comment explaining its purpose, parameters, and return value. This could make it harder for other developers to understand its usage. Please add a descriptive comment. 😊

generated by pr-review-commit missing_comment

@pelikhan pelikhan merged commit 7296b07 into main Sep 30, 2024
10 of 11 checks passed
@pelikhan pelikhan deleted the readcsv branch September 30, 2024 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant