PII Detection Feature #17
Answered by dipanjanb. misterbykl asked this question in Q&A.
-
Hi Emre,
Quick question: does the algorithm you used identify patterns in the
value and apply masking, or does it check the field name?
Depending on which approach it takes, it should either detect a field
named "SSN" or "Social Security Number" and mask it, or detect a value
pattern such as 999-99-9999 and mask that.
Thanks & Regards,
Dipanjan
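To make the two approaches in this question concrete, here is a minimal sketch in Java. It is not the code from the pii-detection module; the class name `PiiChecks`, the field-name list, and the mask string `***-**-****` are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

final class PiiChecks {
    // Approach 1 (field name): an assumed, illustrative list of names to treat as PII.
    static final Set<String> PII_FIELD_NAMES = Set.of("ssn", "social security number");

    // Approach 2 (value pattern): matches values shaped like 999-99-9999.
    static final Pattern SSN_VALUE = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // True when the field name itself is on the assumed PII list.
    static boolean isPiiFieldName(String fieldName) {
        return PII_FIELD_NAMES.contains(fieldName.trim().toLowerCase(Locale.ROOT));
    }

    // Replaces every SSN-shaped substring, whatever field it appears in.
    static String maskSsnValues(String value) {
        return SSN_VALUE.matcher(value).replaceAll("***-**-****");
    }

    public static void main(String[] args) {
        System.out.println(isPiiFieldName("SSN"));           // true
        System.out.println(isPiiFieldName("TIN"));           // false: name is not on the list
        System.out.println(maskSsnValues("id 123-45-6789")); // id ***-**-****
    }
}
```

The name check only works for names that were anticipated, while the value check does not care what the field is called.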
On Thu, 10 Mar 2022 at 19:02, Emre Baykal wrote:
Hi @dipanjanb <https://github.com/dipanjanb>. I need your assistance on
one subject regarding PII detection. Here is the list of what I have
integrated so far:
- Added the pii-detection module in Java, including web-server,
  stateful-function, and type-check functionality.
- Modified pii-detection/module.yaml accordingly. All settings are
  successfully introduced to the project.
- Modified docker-compose.yml accordingly. All settings are
  successfully introduced to the project.
- Ran the test scenario below and observed the expected flow.
Test scenario
1. A test message was sent with Postman to the ingest endpoint
   (:8080/ingest).
2. The test message was received on (com.rtdl.sf.pii/pii-detection) with
   the type IncomingMessage (com.rtdl.sf.pii/IncomingMessage).
3. The test message was sent to a new Kafka topic, "pii-detection".
4. Observed that the topic was created automatically and the message was
   produced to it. Live output was monitored with the Redpanda CLI:
   rpk topic consume pii-detection
Question
1. Which part of the incoming message needs to be masked? I am following
   the sample message below.
{
  "stream_id": "4f7a24ac-2313-4bdc-9222-f7facefd0fff",
  "message_type": "testS3",
  "payload": {
    "name": "user5",
    "array": [4, 5, 6],
    "properties": {"age": 46}
  }
}
Thanks!
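For anyone who wants to repeat step 1 of the test scenario without Postman, the request could be built roughly as below with the JDK's built-in HTTP client (Java 11+). This is only a sketch: the thread gives the endpoint as ":8080/ingest", so the `localhost` host, the `Content-Type` header, and the class name `IngestRequestSketch` are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class IngestRequestSketch {
    // Assumption: the ingest service runs on localhost; the thread only states ":8080/ingest".
    static final String INGEST_URL = "http://localhost:8080/ingest";

    // The sample message quoted in the question.
    static final String SAMPLE_MESSAGE = "{"
        + "\"stream_id\":\"4f7a24ac-2313-4bdc-9222-f7facefd0fff\","
        + "\"message_type\":\"testS3\","
        + "\"payload\":{\"name\":\"user5\",\"array\":[4,5,6],\"properties\":{\"age\":46}}"
        + "}";

    // Builds (but does not send) a POST request carrying the JSON body.
    static HttpRequest buildIngestRequest(String json) {
        return HttpRequest.newBuilder(URI.create(INGEST_URL))
            .header("Content-Type", "application/json") // assumed header
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildIngestRequest(SAMPLE_MESSAGE);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Sending is left commented out so the sketch runs without the stack being up; steps 2 to 4 would then be checked with rpk as described above.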
Answer selected by misterbykl
-
Hi again. I managed to fix this by checking all of the message content. The function identifies patterns in the values and applies masking.
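"Checking all the content" could look roughly like the sketch below: walk every value in the parsed payload and mask any string that matches the pattern. This is not the module's actual code. It assumes the JSON has already been parsed into nested `Map`/`List` objects, and the class name `PayloadMasker`, the SSN-only pattern, and the mask string are illustrative choices.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class PayloadMasker {
    // Assumed pattern: only SSN-shaped values (999-99-9999) are handled here.
    static final Pattern SSN_VALUE = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Returns a masked copy of the node. Maps and lists are walked recursively;
    // strings are masked; numbers, booleans and nulls pass through unchanged.
    static Object mask(Object node) {
        if (node instanceof String) {
            return SSN_VALUE.matcher((String) node).replaceAll("***-**-****");
        }
        if (node instanceof Map<?, ?>) {
            Map<Object, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                copy.put(entry.getKey(), mask(entry.getValue()));
            }
            return copy;
        }
        if (node instanceof List<?>) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) node) {
                copy.add(mask(item));
            }
            return copy;
        }
        return node;
    }

    public static void main(String[] args) {
        // Hypothetical payload: the sample message with an SSN-shaped value added.
        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("name", "user5 123-45-6789");
        payload.put("array", List.of(4, 5, 6));
        payload.put("properties", Map.of("age", 46));
        System.out.println(mask(payload));
    }
}
```

Because the walk ignores keys entirely, nothing in the message has to be designated as "the PII field" up front.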
-
No need to check field names. That is actually a crude and very basic
approach. My question was meant to find out whether the algorithm uses a
brute-force approach or a more sophisticated one, and your answer points
to the latter. Thanks.
Checking the field value with a regex is the more robust solution, because
different applications may use different names for a field with the same
meaning, for example TIN instead of SSN. A regex-based match handles both.
In addition, a developer might accidentally populate a non-PII field with
a PII value. Value-based matching detects the PII in those cases as well.
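Both points can be seen in a small detection sketch. It flags fields by value only, so a field called "TIN" and a free-text field that happens to contain an SSN-shaped value are both caught. The class name `PiiValueScan`, the sample record, and the assumption that the TIN shares the 999-99-9999 shape are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class PiiValueScan {
    // Assumed pattern: SSN-shaped values (999-99-9999).
    static final Pattern SSN_VALUE = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Returns the names of all fields whose value contains an SSN-shaped string.
    // Field names are never inspected, only reported.
    static List<String> fieldsWithPii(Map<String, String> record) {
        List<String> flagged = new ArrayList<>();
        for (Map.Entry<String, String> entry : record.entrySet()) {
            if (SSN_VALUE.matcher(entry.getValue()).find()) {
                flagged.add(entry.getKey());
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Hypothetical record: none of the field names is "SSN".
        Map<String, String> record = new LinkedHashMap<>();
        record.put("TIN", "123-45-6789");
        record.put("comment", "customer gave 987-65-4321 by phone");
        record.put("city", "Pune");
        System.out.println(fieldsWithPii(record)); // [TIN, comment]
    }
}
```

A name-based check limited to "SSN" would have missed both flagged fields.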
Could you please share a brief write-up on how someone can use the
Stateful Function you've created with RTDL? Please outline the objective
of the function and the steps to implement it, and include some
screenshots to give people an idea of what to expect.
I look forward to the document.
On Mon, 14 Mar 2022 at 14:42, Emre Baykal wrote:
Hi. It tries to identify patterns and applies masking using regex. I
asked my question for the same reason you pointed out. It does not check
the field name. Which field name or names would it need to check? Is
there a static set of field names we can predetermine? I'm thinking of
setting up the structure according to those field names. Thanks.
-
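Hi @dipanjanb. I need your assistance on one subject regarding PII detection. Here is the list of what I have integrated so far:
- Added the pii-detection module in Java, including web-server, stateful-function, and type-check functionality.
- Modified pii-detection/module.yaml accordingly. All settings are successfully introduced to the project.
- Modified docker-compose.yml accordingly. All settings are successfully introduced to the project.
- Ran the test scenario below and observed the expected flow.
Test scenario
1. A test message was sent with Postman to the ingest endpoint (:8080/ingest).
2. The test message was received on (com.rtdl.sf.pii/pii-detection) with the type IncomingMessage (com.rtdl.sf.pii/IncomingMessage).
3. The test message was sent to a new Kafka topic, "pii-detection".
4. Observed that the topic was created automatically and the message was produced to it. Live output was monitored with the Redpanda CLI: rpk topic consume pii-detection
Question
1. Which part of the incoming message needs to be masked? I am following the sample message below.
{
  "stream_id": "4f7a24ac-2313-4bdc-9222-f7facefd0fff",
  "message_type": "testS3",
  "payload": {
    "name": "user5",
    "array": [4, 5, 6],
    "properties": {"age": 46}
  }
}
Thanks!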
You can view the active branch here: https://github.com/misterbykl/rtdl/tree/pii-detection