Allow proxy between Ent Search Connectors and Elasticsearch #2017

Open
gbocchini opened this issue Dec 22, 2023 · 9 comments

Comments

@gbocchini

gbocchini commented Dec 22, 2023

Problem Description

Currently, the Elastic connectors (https://github.com/elastic/connectors/blob/8.11/config.yml and https://www.elastic.co/guide/en/enterprise-search/current/connectors.html) have no option to declare a proxy for the connection to Elasticsearch.

Proposed Solution

The change would allow a connector to communicate with Elasticsearch through a proxy connection.
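
As a rough illustration of what such an option could look like, here is a minimal Python sketch. The `proxy` section and its keys (`scheme`, `host`, `port`, `username`, `password`) are hypothetical: they do not exist in the current `config.yml`, and the helper `build_proxy_url` is invented for this example only.

```python
from typing import Optional
from urllib.parse import quote


def build_proxy_url(elasticsearch_config: dict) -> Optional[str]:
    """Return a proxy URL from a hypothetical `proxy` section, or None if absent."""
    proxy = elasticsearch_config.get("proxy")
    if not proxy:
        return None  # no proxy configured: connect directly, as today

    scheme = proxy.get("scheme", "http")
    host = proxy["host"]
    port = proxy.get("port", 8080)  # assumed default, purely illustrative

    credentials = ""
    if proxy.get("username"):
        # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
        user = quote(proxy["username"], safe="")
        password = quote(proxy.get("password", ""), safe="")
        credentials = f"{user}:{password}@"

    return f"{scheme}://{credentials}{host}:{port}"


# Hypothetical parsed configuration; all values are made up for the example.
config = {
    "host": "http://localhost:9200",
    "proxy": {"host": "proxy.internal", "port": 3128, "username": "svc", "password": "p@ss"},
}
print(build_proxy_url(config))  # http://svc:p%40ss@proxy.internal:3128
```

How the resulting URL would be handed to the Elasticsearch client is left open here, since that depends on what the underlying transport supports.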

Additional Context

In many environments, for security reasons or by architectural design, all connections must go through a proxy. Today this is impossible for an Ent Search connector, which limits its use cases and usability.

gbocchini added the enhancement label Dec 22, 2023
@MatheusGelinskiPires

MatheusGelinskiPires commented Dec 22, 2023

@gbocchini , hello!

We may need a proxy configuration for the extraction side as well.

For example: once a connector is connected to Elasticsearch via a proxy, we still need to configure an extractor (Salesforce in my case), and the connection from the connector to Salesforce also needs to go through a proxy.

Thx!

@seanstory
Member

@MatheusGelinskiPires please go ahead and file an enhancement issue for your use case for a salesforce proxy. Unfortunately, each connector typically uses its own transport client implementation, so we'd need to build that type of proxy configuration on a case-by-case basis for each connector. So I'd prefer to not lump that in with this issue.

@seanstory
Member

@gbocchini I'd like to better understand why this feature is necessary. The three main ways we envision connectors being used are:

  1. Native Connectors with an Elastic Cloud Elasticsearch (no proxy needed)
  2. Connector Clients with an Elastic Cloud Elasticsearch (no proxy needed)
  3. Connector Clients with Self Managed Elasticsearch (typically located in the same VPC, so typically no proxy needed)

What was the situation where this need came up?

@gbocchini
Author

Hello @seanstory! Nice to e-meet! This came from a customer using the Salesforce connector. The proxy option would sit between the connector and Elasticsearch. In their scenario (they are a telecom), everything must go through a proxy, because Salesforce is not on their infrastructure.

Without a proxy option between the connector and Elasticsearch, the connection is made outside their proxy, which puts it out of compliance.

Support case 01533799 in case it interests you :)

Thanks!

@seanstory
Member

Thanks for the case number, @gbocchini.

@MatheusGelinskiPires , I understand you're the impacted customer! Can you share any more about your use case, and why it doesn't match one of the 3 situations I described above? Is your environment air gapped or something such that it cannot make requests to the outside internet unless through a proxy?

@MatheusGelinskiPires

Hello @seanstory,
I need to extract some information from Salesforce, which I will use to set up a few Alert Rules.
In this case, if I understood the documentation correctly, I need to use a Connector Client (self-managed) for Salesforce.
This connector will be deployed in our on-premises infrastructure.
To comply with our security rules, everything deployed on our on-premises infrastructure that needs to reach the internet must do so via a proxy.

@MatheusGelinskiPires

Hello @seanstory, to complement the information above: since our Elastic environment is a Cloud environment, the Elasticsearch connection must go through the proxy as well.

That could also lead to this issue: elastic/elasticsearch-py#2217

@artem-shelkovnikov
Member

More info about another case that I've heard about:

There's a setup in a private virtual network that uses a proxy to reach the outside world or other private networks. The connector therefore cannot reach the 3rd party directly and has to go through an HTTP proxy, with an SSL certificate used for authentication:

[                     Private VNetwork                       ]         [   Internet   ]
[Elasticsearch] <<< >>> [SPO Connector] <<< >>> [ HTTP Proxy ] <<< >>> [ 3rd-party ]

Requirements are to be able to connect to proxy anonymously (for testing) OR with basic auth (for testing too) OR with a certificate.
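
To make the three modes concrete, here is a minimal, standard-library-only Python sketch. It is not taken from the POC PR; the function `proxy_connection_options` and the keys of the dictionary it returns are hypothetical, and how they would be passed to a given connector's HTTP client depends on that client.

```python
import base64
import ssl
from typing import Optional


def proxy_connection_options(
    proxy_url: str,
    username: Optional[str] = None,
    password: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> dict:
    """Collect proxy settings for anonymous, basic-auth, or certificate access."""
    options = {"proxy": proxy_url, "headers": {}, "ssl_context": None}

    if username is not None:
        # Basic auth: base64("user:password") in the Proxy-Authorization header.
        raw = f"{username}:{password or ''}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        options["headers"]["Proxy-Authorization"] = f"Basic {token}"

    if client_cert is not None:
        # Certificate auth: present a client certificate during the TLS handshake.
        context = ssl.create_default_context()
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
        options["ssl_context"] = context

    return options


# Anonymous access (testing): only the proxy URL is set.
anonymous = proxy_connection_options("http://proxy.internal:3128")

# Basic auth (testing): adds a Proxy-Authorization header.
basic = proxy_connection_options("http://proxy.internal:3128", "user", "pass")
print(basic["headers"])
```

The certificate branch needs real certificate and key files on disk, so it is not exercised in the usage lines above.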

Here's a POC PR that shows how much effort is needed to implement such logic for Sharepoint Online connector: https://github.com/elastic/connectors/pull/2266/files. This PR, however, does not have SSL Certificate support as I had no time to set up a proxy with SSL Certificate. It would take around an hour to add and test SSL Certificate support when a proxy is set up and available.

@MatheusGelinskiPires

There is another case, where Elasticsearch is running on an Elastic Cloud deployment.

In that case, the proxy should be used for this connection as well.
