Doc/processor #141

Open. Wants to merge 9 commits into base: main. Changes from 8 commits.
1 change: 1 addition & 0 deletions README.md
@@ -220,6 +220,7 @@ class YourController
### Advanced Usage

1. [Configuration](./docs/configuration.md)
2. [Processing (e.g. saving files)](./docs/processing.md)
3. [Working with assets](./docs/assets.md)
4. [Builders API](./docs/builders_api.md)
5. [Async & Webhooks](./docs/webhook.md)
7 changes: 7 additions & 0 deletions composer-dependency-analyser.php
@@ -13,6 +13,13 @@
->addPathToScan(__DIR__.'/src/Debug', isDev: true)
->addPathToScan(__DIR__.'/tests', isDev: true)

->ignoreErrorsOnPackage('async-aws/s3', [
ErrorType::DEV_DEPENDENCY_IN_PROD,
])
->disableExtensionsAnalysis() // TODO : Bug waiting for https://github.com/shipmonk-rnd/composer-dependency-analyser/issues/217
->ignoreErrorsOnPackage('league/flysystem', [
ErrorType::DEV_DEPENDENCY_IN_PROD,
])
->ignoreErrorsOnPackage('symfony/routing', [
ErrorType::DEV_DEPENDENCY_IN_PROD,
])
12 changes: 9 additions & 3 deletions composer.json
@@ -33,12 +33,16 @@
"symfony/mime": "^6.4 || ^7.0"
},
"require-dev": {
+ "ext-mbstring": "*",
+ "async-aws/s3": "^2.6",

Contributor Author: It might be worth lowering the requirement.

"friendsofphp/php-cs-fixer": "^3.41",
+ "league/flysystem": "^3.29",
+ "league/flysystem-bundle": "^3.3",

Contributor Author (comment on lines +39 to +40): It might be worth lowering the requirement.

"phpstan/extension-installer": "^1.3",
"phpstan/phpstan": "^1.10",
"phpstan/phpstan-symfony": "^1.3",
"phpunit/phpunit": "^10.4",
- "shipmonk/composer-dependency-analyser": "^1.7",
+ "shipmonk/composer-dependency-analyser": "^1.8",

Contributor Author: Needed to ignore the `ext-mbstring` issues.

"symfony/framework-bundle": "^6.4 || ^7.0",
"symfony/http-client": "^6.4 || ^7.0",
"symfony/monolog-bundle": "^3.10",
@@ -56,7 +60,9 @@
},
"suggest": {
"symfony/monolog-bundle": "Enables logging throughout the generating process.",
- "symfony/twig-bundle": "Allows you to use Twig to render templates into PDF",
- "monolog/monolog": "Enables logging throughout the generating process."
+ "symfony/twig-bundle": "Allows you to use Twig to render templates into PDF.",
+ "monolog/monolog": "Enables logging throughout the generating process.",
+ "async-aws/s3": "Upload any file to AWS S3-compatible endpoints supporting multipart upload, without memory overhead.",
+ "league/flysystem-bundle": "Upload any file using this filesystem abstraction package."
}
}
11 changes: 10 additions & 1 deletion docs/configuration.md
@@ -8,8 +8,10 @@ The default configuration for the bundle looks like :

Assuming you have the following client configured.

```yaml
<details>
<summary>app/config/framework.yaml</summary>

```yaml
# app/config/framework.yaml

framework:
@@ -19,8 +21,13 @@ framework:
base_uri: 'http://localhost:3000'
```

</details>

Then

<details>
<summary>app/config/sensiolabs_gotenberg.yaml</summary>

```yaml
# app/config/sensiolabs_gotenberg.yaml

@@ -1237,6 +1244,8 @@ sensiolabs_gotenberg:
# - { name: 'X-Custom-Header', value: 'custom-header-value' }
```

</details>

> [!TIP]
> For more information about the [PDF properties](https://gotenberg.dev/docs/routes#page-properties-chromium)
> or [screenshot properties](https://gotenberg.dev/docs/routes#screenshots-route).
90 changes: 90 additions & 0 deletions docs/processing.md
@@ -0,0 +1,90 @@
# Processing

To save the generated PDF or screenshot as a file, you need a processor: a class implementing `Sensiolabs\GotenbergBundle\Processor\ProcessorInterface`.
To avoid loading the whole file content in memory, you can stream it to the browser.

You can also hook into the stream and save the file chunk by chunk. To do so, the bundle leverages the [`->stream`](https://symfony.com/doc/current/http_client.html#streaming-responses) method from `HttpClientInterface` and a powerful feature of PHP generators: [`->send`](https://www.php.net/manual/en/generator.send.php).
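
To illustrate the mechanism, here is a minimal sketch of a generator that receives data through `send()` and hands back a result through `getReturn()`. It is plain PHP and not the bundle's code: `byteCounter` and the `null` end marker are assumptions made for this example (processors detect the end with `$chunk->isLast()` instead).

```php
<?php

/**
 * Hypothetical sink: adds up the size of every string it receives until it
 * gets null, then returns the total number of bytes seen.
 */
function byteCounter(): \Generator
{
    $total = 0;

    while (null !== ($piece = yield)) {
        $total += \strlen($piece);
    }

    return $total;
}

$sink = byteCounter();

foreach (['%PDF-', '1.7', "\n"] as $piece) {
    // Resumes the generator: the pending `yield` evaluates to $piece.
    $sink->send($piece);
}

// Signals the end so the generator runs up to its `return` statement.
$sink->send(null);

$total = $sink->getReturn();
```

The caller pushes data in at its own pace, and the generator keeps its local state (`$total`) between two `send()` calls, which is what makes chunk-by-chunk processing possible without buffering the whole file.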

## Using FileProcessor

Useful if you want to store the file in the local filesystem.
Example when generating a PDF :

```php
use Sensiolabs\GotenbergBundle\GotenbergPdfInterface;
use Sensiolabs\GotenbergBundle\Processor\FileProcessor;
use Symfony\Component\DependencyInjection\Attribute\Autowire;
use Symfony\Component\Filesystem\Filesystem;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Attribute\Route;

#[Route(path: '/my-pdf', name: 'my_pdf')]
public function pdf(
    GotenbergPdfInterface $gotenbergPdf,
    Filesystem $filesystem,

    #[Autowire('%kernel.project_dir%/var/pdf')]
    string $pdfStorage,
): Response {
    return $gotenbergPdf->html()
        //
        ->fileName('my_pdf')
        ->processor(new FileProcessor(
            $filesystem,
            $pdfStorage,
        ))
        ->generate()
        ->stream()
    ;
}
```

This will save the file under `%kernel.project_dir%/var/pdf/my_pdf.pdf` once the file has been fully streamed to the browser.
If you are not streaming to a browser, you can still process the file using the `process` method instead of `stream` :

```php
use Sensiolabs\GotenbergBundle\GotenbergPdfInterface;
use Sensiolabs\GotenbergBundle\Processor\FileProcessor;
use Symfony\Component\DependencyInjection\Attribute\Autowire;
use Symfony\Component\Filesystem\Filesystem;

class SomeService
{
    public function __construct(
        private readonly GotenbergPdfInterface $gotenbergPdf,

        // A plain service has no getParameter(): inject the directory instead.
        #[Autowire('%kernel.project_dir%/var/pdf')]
        private readonly string $pdfStorage,
    ) {
    }

    public function pdf(): \SplFileInfo
    {
        return $this->gotenbergPdf->html()
            //
            ->fileName('my_pdf')
            ->processor(new FileProcessor(
                new Filesystem(),
                $this->pdfStorage,
            ))
            ->generate()
            ->process()
        ;
    }
}
```

This will return a `SplFileInfo` of the generated file stored at `%kernel.project_dir%/var/pdf/my_pdf.pdf`.

## Other processors

* `Sensiolabs\GotenbergBundle\Processor\AsyncAwsProcessor` : Uploads using the `async-aws/s3` package, relying on the [multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) feature of S3. Returns an `AsyncAws\S3\Result\CompleteMultipartUploadOutput` object.
* `Sensiolabs\GotenbergBundle\Processor\FlysystemProcessor` : Uploads using the `league/flysystem-bundle` package. Returns a `callable` that, when called, returns the uploaded content.
* `Sensiolabs\GotenbergBundle\Processor\ChainProcessor` : Applies multiple processors. Each chunk is sent to each processor sequentially. Returns an array of the values returned by the chained processors.
* `Sensiolabs\GotenbergBundle\Processor\NullProcessor` : Empty processor. Does nothing. Returns `null`.
* `Sensiolabs\GotenbergBundle\Processor\TempfileProcessor` : Creates a temporary file and dumps all chunks into it. Returns the `resource` of said `tmpfile()`.
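
As a rough illustration of the chaining idea, the sketch below forwards every piece to several generators in order and then gathers their return values into an array. It is not the `ChainProcessor` implementation: `collect`, `countBytes`, `chain` and the `null` end marker are all hypothetical names and conventions chosen for this example.

```php
<?php

/** Hypothetical sink that concatenates everything it receives. */
function collect(): \Generator
{
    $content = '';
    while (null !== ($piece = yield)) {
        $content .= $piece;
    }

    return $content;
}

/** Hypothetical sink that counts the bytes it receives. */
function countBytes(): \Generator
{
    $total = 0;
    while (null !== ($piece = yield)) {
        $total += \strlen($piece);
    }

    return $total;
}

/**
 * Forwards every piece (including the final null) to each inner generator,
 * in order, then gathers their return values.
 *
 * @param list<\Generator> $inner
 */
function chain(array $inner): \Generator
{
    do {
        $piece = yield;
        foreach ($inner as $generator) {
            $generator->send($piece);
        }
    } while (null !== $piece);

    return array_map(static fn (\Generator $g): mixed => $g->getReturn(), $inner);
}

$chained = chain([collect(), countBytes()]);
foreach (['foo', 'bar', null] as $piece) {
    $chained->send($piece);
}

$results = $chained->getReturn();
```

Each inner generator sees the same sequence of pieces, so one pass over the stream can feed several independent consumers.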

## Custom processor

A custom processor must implement `Sensiolabs\GotenbergBundle\Processor\ProcessorInterface`, which requires your `__invoke` method to return a `\Generator`. To receive a chunk, assign `yield` to a variable, like so: `$chunk = yield`.

The minimal code needed is the following :

```php
do {
    $chunk = yield;
    // do something with it
} while (!$chunk->isLast());
```
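
Putting it together, here is a sketch of a complete custom processor that hashes the file while it streams by. `Sha256Processor` and `FakeChunk` are hypothetical: `FakeChunk` only stands in for the chunk objects so that the example runs on its own, and a real processor would declare `implements ProcessorInterface`.

```php
<?php

// Stand-in for the chunk objects sent to a processor. It only offers the two
// methods used in this document. Hypothetical, for illustration only.
final class FakeChunk
{
    public function __construct(
        private readonly string $content,
        private readonly bool $last = false,
    ) {
    }

    public function getContent(): string
    {
        return $this->content;
    }

    public function isLast(): bool
    {
        return $this->last;
    }
}

// Hypothetical processor: hashes the content chunk by chunk and returns the
// hexadecimal digest once the last chunk has been received.
final class Sha256Processor
{
    public function __invoke(string|null $fileName): \Generator
    {
        $context = hash_init('sha256');

        do {
            $chunk = yield;
            hash_update($context, $chunk->getContent());
        } while (!$chunk->isLast());

        return hash_final($context);
    }
}

// Driving the generator by hand, the way a caller of the processor would.
$generator = (new Sha256Processor())('my_pdf.pdf');
$generator->send(new FakeChunk('Hello, '));
$generator->send(new FakeChunk('world!', last: true));

$digest = $generator->getReturn();
```

The value returned by the generator is read with `getReturn()` here. Going by the `FileProcessor` example above, where `process()` returns that processor's `SplFileInfo`, the digest is what `process()` would hand back for this processor.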

Contributor: We should also explain that the value returned by the processor is what the `process` method returns.
118 changes: 118 additions & 0 deletions src/Processor/AsyncAwsProcessor.php
@@ -0,0 +1,118 @@
<?php

namespace Sensiolabs\GotenbergBundle\Processor;

use AsyncAws\S3\Result\CompleteMultipartUploadOutput;
use AsyncAws\S3\S3Client;
use Psr\Log\LoggerInterface;
use Sensiolabs\GotenbergBundle\Exception\ProcessorException;

/**
 * TODO : Might be worth adding "MultiPart" to the name, as not all services support multipart upload.
 *
 * @implements ProcessorInterface<CompleteMultipartUploadOutput>
 */
final class AsyncAwsProcessor implements ProcessorInterface

Contributor: IMHO, specific processors like AWS or Flysystem must live in a separate bridge to avoid adding dev dependencies. It could block the update of GotenbergBundle to Symfony x.y if, for example, FlysystemBundle does not support it yet.

Contributor Author: But then users would potentially have to require it just for one class. That seems a bit much, don't you think?

Contributor Author: IMO such classes should be pinned to the package version rather than the framework version. So maybe rename it to state the package name and version more clearly?

Contributor: OK for a Bridge folder (like the Notifier, for example) for now.

To avoid forcing developers to install both Flysystem and AsyncAws in their local projects, we can install these dev dependencies only in the CI (if tests need them).

Contributor Author: They are already dev dependencies, so they are not installed in developers' projects. They could be moved to `suggest` only, if needed, and required manually in the test environment for CI.

Contributor: Why not. We should also rely only on the Flysystem library, not on the bundle.

{
    private const MIN_PART_SIZE = 5 * 1024 * 1024;

    public function __construct(
        private S3Client $s3Client,
        private string $bucketName,
        private readonly LoggerInterface|null $logger = null,
    ) {
    }

    public function __invoke(string|null $fileName): \Generator
    {
        if (null === $fileName) {
            $fileName = uniqid('gotenberg_', true);

Contributor: Maybe use Filesystem here?

Contributor Author: This will be dropped soon, I think. I don't see a reason why the filename should be null. I went along with it in this PR, but it will change. Otherwise, yes, Filesystem could be a lead.

Collaborator: Yeah, that is because of `getFileName()`, which returns `$this->response->getFileName()`. But I think we can't receive null here, or it will crash with an exception before reaching this part.

            $this->logger?->debug('{processor}: no filename given. Content will be dumped to "{file}".', ['processor' => self::class, 'file' => $fileName]);
        }

        $this->logger?->debug('{processor}: starting multi part upload of "{file}".', ['processor' => self::class, 'file' => $fileName]);

Contributor: `private readonly LoggerInterface $logger = new NullLogger()`?

Contributor Author: Why? This already works well. What would we gain?

        $multipart = $this->s3Client->createMultipartUpload([
            'Bucket' => $this->bucketName,
            'Key' => $fileName,
        ]);

        $uploadId = $multipart->getUploadId();
        if (null === $uploadId) {
            throw new ProcessorException('Could not initiate a multi part upload');
        }

        $uploads = [];

        $partNumber = 0;
        $currentChunk = '';

        try {
            do {
                $chunk = yield;

                $currentChunk .= $chunk->getContent();

                if (mb_strlen($currentChunk, '8bit') < self::MIN_PART_SIZE) {
                    continue;
                }

                ++$partNumber;

                $this->logger?->debug('{processor}: {min_size_required} reached. Uploading part {upload_part_number}', ['processor' => self::class, 'min_size_required' => self::MIN_PART_SIZE, 'upload_part_number' => $partNumber]);
                $upload = $this->s3Client->uploadPart([
                    'Bucket' => $this->bucketName,
                    'Key' => $fileName,
                    'Body' => $currentChunk,
                    'PartNumber' => $partNumber,
                    'UploadId' => $uploadId,
                ]);

                $uploads[] = [
                    'PartNumber' => $partNumber,
                    'ETag' => $upload->getEtag(),
                ];

                $currentChunk = '';
            } while (!$chunk->isLast());

            if ('' !== $currentChunk) {
                ++$partNumber;

                $this->logger?->debug('{processor}: last chunk reached. Uploading leftover part {upload_part_number}', ['processor' => self::class, 'upload_part_number' => $partNumber]);
                $upload = $this->s3Client->uploadPart([
                    'Bucket' => $this->bucketName,
                    'Key' => $fileName,
                    'Body' => $currentChunk,
                    'PartNumber' => $partNumber,
                    'UploadId' => $uploadId,
                ]);

                $uploads[] = [
                    'PartNumber' => $partNumber,
                    'ETag' => $upload->getEtag(),
                ];
            }

            unset($currentChunk, $upload);

            $this->logger?->debug('{processor}: completing multi part upload of "{file}".', ['processor' => self::class, 'file' => $fileName]);

            return $this->s3Client->completeMultipartUpload([
                'UploadId' => $uploadId,
                'Bucket' => $this->bucketName,
                'Key' => $fileName,
                'MultipartUpload' => [
                    'Parts' => $uploads,
                ],
            ]);
        } catch (\Throwable $e) {
            $this->s3Client->abortMultipartUpload([
                'UploadId' => $uploadId,
                'Bucket' => $this->bucketName,
                'Key' => $fileName,
            ]);

            throw $e;
        }
    }
}
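
The buffering step used by `AsyncAwsProcessor` (accumulate chunks until `MIN_PART_SIZE` is reached, emit a part, then flush whatever is left at the end) can be sketched in isolation. This is only an illustration under stated assumptions: `partBuffer`, the `null` end marker and the tiny 4-byte threshold are hypothetical, and no S3 call is made.

```php
<?php

/**
 * Groups incoming pieces into parts of at least $minPartSize bytes.
 * Send null to signal the end; the generator then returns the parts.
 */
function partBuffer(int $minPartSize): \Generator
{
    $parts = [];
    $buffer = '';

    while (null !== ($piece = yield)) {
        $buffer .= $piece;

        if (\strlen($buffer) < $minPartSize) {
            continue; // not enough data for a full part yet
        }

        $parts[] = $buffer; // the processor above calls uploadPart() at this point
        $buffer = '';
    }

    if ('' !== $buffer) {
        $parts[] = $buffer; // leftover part, may be smaller than the minimum
    }

    return $parts;
}

$buffer = partBuffer(4);
foreach (['ab', 'cd', 'efgh', 'i'] as $piece) {
    $buffer->send($piece);
}
$buffer->send(null);

$parts = $buffer->getReturn();
```

Only the last part can end up below the threshold, which matches the separate "leftover" upload after the loop in the class above.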
8 changes: 3 additions & 5 deletions src/Processor/FileProcessor.php
@@ -25,16 +25,14 @@ public function __invoke(string|null $fileName): \Generator
             $this->logger?->debug('{processor}: no filename given. Content will be dumped to "{file}".', ['processor' => self::class, 'file' => $fileName]);
         }

-        $resource = tmpfile() ?: throw new ProcessorException('Unable to create a temporary file resource.');
+        $tempfileProcessor = (new TempfileProcessor())($fileName);

         do {
             $chunk = yield;
-            if (false === fwrite($resource, $chunk->getContent())) {
-                throw new ProcessorException('Unable to write to the temporary file resource.');
-            }
+            $tempfileProcessor->send($chunk);
         } while (!$chunk->isLast());

-        rewind($resource);
+        $resource = $tempfileProcessor->getReturn();

         try {
             $path = $this->directory.'/'.$fileName;
49 changes: 49 additions & 0 deletions src/Processor/FlysystemProcessor.php
@@ -0,0 +1,49 @@
<?php

namespace Sensiolabs\GotenbergBundle\Processor;

use League\Flysystem\FilesystemOperator;
use Psr\Log\LoggerInterface;
use Sensiolabs\GotenbergBundle\Exception\ProcessorException;

/**
 * @implements ProcessorInterface<(Closure(): string)>
 */
final class FlysystemProcessor implements ProcessorInterface
{
    public function __construct(
        private readonly FilesystemOperator $filesystemOperator,
        private readonly LoggerInterface|null $logger = null,
    ) {
    }

    public function __invoke(string|null $fileName): \Generator
    {
        if (null === $fileName) {
            $fileName = uniqid('gotenberg_', true);
        }

        $tmpfileProcessor = (new TempfileProcessor())($fileName);

        do {
            $chunk = yield;
            $tmpfileProcessor->send($chunk);
        } while (!$chunk->isLast());

        $tmpfile = $tmpfileProcessor->getReturn();

        try {
            $this->filesystemOperator->writeStream($fileName, $tmpfile);

            $this->logger?->debug('{processor}: content dumped to "{file}".', ['processor' => self::class, 'file' => $fileName]);

            return function () use ($fileName) {
                return $this->filesystemOperator->read($fileName); // use readStream instead ?
            };

Contributor: This does not feel optimised.

Contributor Author: It does nothing unless you call that callable. I was thinking of returning an anonymous class that implements `Stringable` and returns the result of this call instead, to make it easier to use. What do you suggest?
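
A minimal sketch of the `Stringable` idea, assuming the wrapped callable does the actual read. `lazyContent` is a hypothetical helper, and the closure below stands in for `$this->filesystemOperator->read($fileName)`. Nothing is read until the object is cast to a string; caching the result afterwards is an extra assumption, not something proposed in this thread.

```php
<?php

/**
 * Wraps a reader so the content is only fetched when the returned object is
 * cast to string. Hypothetical helper, for illustration only.
 *
 * @param \Closure(): string $reader
 */
function lazyContent(\Closure $reader): \Stringable
{
    return new class($reader) implements \Stringable {
        private string|null $content = null;

        public function __construct(private readonly \Closure $reader)
        {
        }

        public function __toString(): string
        {
            // Read on first use, then reuse the cached value.
            return $this->content ??= ($this->reader)();
        }
    };
}

$reads = 0;
$content = lazyContent(function () use (&$reads): string {
    ++$reads; // counts how often the underlying read happens

    return '%PDF-1.7';
});

$first = (string) $content;
$second = (string) $content;
```

Callers could then pass the returned object anywhere a string is accepted, without having to know that it must be invoked first.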

        } catch (\Throwable $t) {
            throw new ProcessorException(\sprintf('Unable to write to "%s".', $fileName), previous: $t);
        } finally {
            fclose($tmpfile);
        }
    }
}
27 changes: 27 additions & 0 deletions src/Processor/TempfileProcessor.php
@@ -0,0 +1,27 @@
<?php

namespace Sensiolabs\GotenbergBundle\Processor;

use Sensiolabs\GotenbergBundle\Exception\ProcessorException;

/**
 * @implements ProcessorInterface<resource>
 */
final class TempfileProcessor implements ProcessorInterface
{
    public function __invoke(string|null $fileName): \Generator
    {
        $resource = tmpfile() ?: throw new ProcessorException('Unable to create a temporary file resource.');

        do {
            $chunk = yield;
            if (false === fwrite($resource, $chunk->getContent())) {
                throw new ProcessorException('Unable to write to the temporary file resource.');
            }
        } while (!$chunk->isLast());

        rewind($resource);

        return $resource;
    }
}