<pre class="metadata">
Title: Self-Review Questionnaire: Security and Privacy
Status: ED
TR: https://www.w3.org/TR/security-privacy-questionnaire/
ED: https://w3ctag.github.io/security-questionnaire/
Shortname: security-privacy-questionnaire
Repository: w3ctag/security-questionnaire
Level: None
Editor: Theresa O’Connor, w3cid 40614, Apple Inc. https://apple.com, hober@apple.com
Editor: Peter Snyder, w3cid 109401, Brave Software https://brave.com, pes@brave.com
Former Editor: Jason Novak, Apple Inc., https://apple.com
Former Editor: Lukasz Olejnik, Independent researcher, https://lukaszolejnik.com
Former Editor: Mike West, Google Inc., mkwst@google.com
Former Editor: Yan Zhu, Yahoo Inc., yan@brave.com
Group: tag
Markup Shorthands: css no, markdown yes
Local Boilerplate: status yes
Local Boilerplate: copyright yes
Abstract: This document contains a set of questions to be used when
evaluating the security and privacy implications of web platform
technologies.
</pre>
<h2 id="intro">Introduction</h2>
When designing new features for the Web platform,
we must always consider the security and privacy implications of our work.
New Web features should always
maintain or enhance
the overall security and privacy of the Web.
This document contains a set of questions
intended to help <abbr title="specification">spec</abbr> authors
as they think through
the security and privacy implications
of their work and write the narrative Security Considerations and Privacy
Considerations sections for inclusion in-line in their specifications,
as described below in [[#considerations]].
It also documents mitigation strategies
that spec authors can use to address
security and privacy concerns they encounter as they work on their spec.
This document is itself a work in progress,
and there may be security or privacy concerns
which this document does not (yet) cover.
Please [let us know](https://github.com/w3ctag/security-questionnaire/issues/new)
if you identify a security or privacy concern
this questionnaire should ask about.
<h3 id="howtouse">How To Use The Questionnaire</h3>
Work through these questions
early on in the design process,
when things are easier to change.
When privacy and security issues are only found later,
after a feature has shipped,
it's much harder to change the design.
If security or privacy issues are found late,
user agents may need to adopt breaking changes
to fix the issues.
Keep these questions in mind while working on specifications.
Periodically revisit this questionnaire and continue to consider the questions,
particularly as a design changes over time.
<h3 id=resources>Additional resources</h3>
The Mitigating Browser Fingerprinting in Web Specifications
[[FINGERPRINTING-GUIDANCE]] document published by PING goes into
further depth about browser fingerprinting and should be considered in
parallel with this document.
The IETF's RFC about privacy considerations, [[RFC6973]], is a
wonderful resource, particularly section 7.
<h3 id=reviews>TAG, PING, security reviews and this questionnaire</h3>
Before requesting
privacy and
security reviews from the [Privacy Interest Group
(PING)](https://www.w3.org/Privacy/IG/) and security reviewers,
write "Security Considerations" and
"Privacy Considerations" sections in your document, as described in
[[#considerations]]. Answering the questions in this
document will, we hope, inform your writing of those sections. It is not
appropriate, however, to merely copy this questionnaire into those sections.
Instructions for requesting security and privacy reviews can be
found in the document
<cite>[How to do Wide Review](https://www.w3.org/Guide/documentreview/#how_to_get_horizontal_review)</cite>.
When requesting
a [review](https://github.com/w3ctag/design-reviews)
from the [Technical Architecture Group (TAG)](https://www.w3.org/2001/tag/),
please provide the TAG with answers
to the questions in this document.
[This Markdown
template](https://raw.githubusercontent.com/w3ctag/security-questionnaire/main/questionnaire.markdown)
may be useful when doing so.
<h2 id="questions">Questions to Consider</h2>
<h3 class=question id="purpose">
What information does this feature expose,
and for what purposes?
</h3>
User agents should only expose information to the Web
when doing so is necessary to serve a clear user need.
Does your feature expose information to websites?
If so, how does exposing this information benefit the user?
Are the risks to the user outweighed by the benefits to the user?
If so, how?
See also
* [[DESIGN-PRINCIPLES#priority-of-constituencies]]
When answering this question, please consider each of these four possible
areas of information disclosure / sharing.
For the below sub-questions,
please take the term *potentially identifying information*
to mean information that describes the browser user,
distinct from others who use the same browser version.
Examples of such *potentially identifying information* include information
about the browser user's environment (e.g., operating system configuration,
browser configuration, hardware capabilities), and the user's prior activities
and interests (e.g., browsing history, purchasing preferences, personal
characteristics).
1. What information does your spec expose to the **first party** that
the **first party** cannot currently easily determine?
2. What information does your spec expose to **third parties** that
**third parties** cannot currently easily determine?
3. What *potentially identifying information* does your spec expose to the
**first party** that the **first party** can already access (i.e., what
identifying information does your spec duplicate or mirror)?
4. What *potentially identifying information* does your spec expose to
**third parties** that **third parties** can already access?
<h3 class=question id="minimum-data">
Do features in your specification expose the minimum amount of information
necessary to implement the intended functionality?
</h3>
Features should only expose information
when it's absolutely necessary.
If a feature exposes more information than is necessary,
why does it do so, and can the same functionality be achieved by
exposing less information?
See also
* [[#data-minimization]]
<p class=example>
Content Security Policy [[CSP]] unintentionally exposed redirect targets
cross-origin by allowing one origin to infer details about another origin
through violation reports (see [[HOMAKOV]]). The working group eventually
mitigated the risk by reducing a policy's granularity after a redirect.
</p>
<h3 class=question id="personal-data">
Do the features in your specification expose personal information,
personally-identifiable information (PII), or information derived from
either?
</h3>
Personal information is any data about a user
(for example, their home address),
or information that could be used to identify a user,
such as an alias, email address, or identification number.
Note: Personal information is
distinct from personally identifiable information
(<abbr title="personally identifiable information">PII</abbr>).
PII is a legal concept,
the definition of which varies from jurisdiction to jurisdiction.
When used in a non-legal context,
PII tends to refer generally
to information
that could be used to identify a user.
When exposing
personal information, PII, or derivative information,
specification authors must prevent or, when prevention is not possible, minimize
potential harm to users.
<p class=example>
A feature
which gathers biometric data
(such as fingerprints or retina scans)
for authentication
should not directly expose this biometric data to the web.
Instead,
it can use the biometric data
to look up or generate some temporary key which is not shared across origins
which can then be safely exposed to the origin. [[WEBAUTHN]]
</p>
Personal information, PII, or their derivatives
should not be exposed to origins
without [meaningful user consent](https://w3ctag.github.io/design-principles/#consent).
Many APIs
use the Permissions API to acquire meaningful user consent.
[[PERMISSIONS]]
Keep in mind
that each permission prompt
added to the web platform
increases the risk
that users will ignore
the contents of all permission prompts.
Before adding a permission prompt, consider your options for using
a less obtrusive way to gain meaningful user consent.
[[ADDING-PERMISSION]]
<p class=example>
`<input type=file>` can be used to upload
documents containing personal information
to websites.
It makes use of
the underlying native platform's file picker
to ensure the user understands
that the file and its contents
will be exposed to the website,
without a separate permissions prompt.
</p>
See also
* [[#user-mediation]]
* [[DESIGN-PRINCIPLES#consent]]
<h3 class=question id="sensitive-data">
How do the features in your specification deal with sensitive information?
</h3>
Personal information is not the only kind of sensitive information.
Many other kinds of information may also be sensitive.
What is or isn't sensitive information can vary
from person to person
or from place to place.
Information that would be harmless if known about
one person or group of people
could be dangerous if known about
another person or group.
Information about a person
that would be harmless in one country
might be used in another country
to detain, kidnap, or imprison them.
Examples of sensitive information include:
caste,
citizenship,
color,
credentials,
criminal record,
demographic information,
disability status,
employment status,
ethnicity,
financial information,
health information,
location data,
marital status,
political beliefs,
profession,
race,
religious beliefs or nonbeliefs,
sexual preferences,
and
trans status.
When a feature exposes sensitive information to the web,
its designers must take steps
to mitigate the risk of exposing the information.
<div class=example>
The Credential Management API allows sites
to request a user's credentials
from a password manager. [[CREDENTIAL-MANAGEMENT-1]]
If it exposed the user's credentials to JavaScript,
and if the page using the API were vulnerable to [=XSS=] attacks,
the user's credentials could be leaked to attackers.
The Credential Management API
mitigates this risk
by not exposing the credentials to JavaScript.
Instead, it exposes
an opaque {{FormData}} object
which cannot be read by JavaScript.
The spec also recommends
that sites configure Content Security Policy [[CSP]]
with reasonable [=connect-src=] and [=form-action=] values
to further mitigate the risk of exfiltration.
</div>
Many use cases
which require location information
can be adequately served
with very coarse location data.
For instance,
a site which recommends restaurants
could adequately serve its users
with city-level location information
instead of exposing the user's precise location.
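The data-minimization idea above can be sketched in a few lines. This is an illustrative example only, not part of any specification; the function names are invented here:

```javascript
// Hypothetical sketch: reduce a precise coordinate to a coarse,
// city-scale value before exposing it to an origin. Rounding to
// 1 decimal place (~11 km at the equator) is far less identifying
// than full GPS precision (~1 m).
function coarsenCoordinate(value, decimals = 1) {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

function coarsenPosition({ latitude, longitude }) {
  return {
    latitude: coarsenCoordinate(latitude),
    longitude: coarsenCoordinate(longitude),
  };
}

// A precise position in central Paris...
const precise = { latitude: 48.858370, longitude: 2.294481 };
// ...becomes a city-level one.
const coarse = coarsenPosition(precise);
// coarse.latitude === 48.9, coarse.longitude === 2.3
```

A real user agent would apply such coarsening (or stronger techniques, such as snapping to a fixed grid) before any position reaches script.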
See also
* [[DESIGN-PRINCIPLES#do-not-expose-use-of-assistive-tech]]
<h3 class=question id=hidden-data>
Does data exposed by your specification carry related but distinct
information that may not be obvious to users?
</h3>
Features which enable users
to share data with origins
should ensure that such data
does not carry embedded, possibly hidden, information
without the user's awareness, understanding, and consent.
Documents
such as image or video files
often contain metadata about
where and when the image, video, or audio was captured
and
what kind of device captured or produced the data.
When uploaded,
this kind of metadata
may reveal to origins
information the user did not intend to reveal,
such as the user's present or past location
and socioeconomic status.
User agents should enable users to choose
whether or not to share such data with sites,
and the default should be that such data
is not shared.
<h3 class=question id="persistent-origin-specific-state">
Do the features in your specification introduce state
that persists across browsing sessions?
</h3>
The Web platform already includes many mechanisms
origins can use to
store information.
Cookies,
`ETag`,
`Last-Modified`,
{{localStorage}},
and
{{indexedDB}}
are just a few examples.
Allowing a website
to store data
on a user’s device
in a way that persists across browsing sessions
introduces the risk
that this state may be used
to track a user
without their knowledge or control,
either in [=first-party-site context|first-=] or [=third-party context|third-party=] contexts.
One way
user agents prevent origins from
abusing client-side storage mechanisms
is by providing users with the ability
to clear data stored by origins.
Specification authors should include similar
protections to make sure that new
client-side storage mechanisms
cannot be misused to track users across domains
without their control.
However, just giving users the ability
to delete origin-set state is usually
not sufficient since users rarely
manually clear browser state.
Spec authors should consider ways
to make new features more privacy-preserving without full storage clearing,
such as
reducing the uniqueness of values,
rotating values,
or otherwise making features no more identifying than is needed.
<!-- https://github.com/w3ctag/design-principles/issues/215 -->
Additionally, specification authors
should carefully consider and specify, when possible,
how their features should interact with browser caching
features. Additional mitigations may be necessary to
prevent origins from abusing caches to
identify and track users across sites or sessions without user consent.
<p class=example>
Platform-specific DRM implementations
(such as [=content decryption modules=] in [[ENCRYPTED-MEDIA]])
might expose origin-specific information
in order to help identify users
and determine whether they ought to be granted access
to a specific piece of media.
These kinds of identifiers
should be carefully evaluated
to determine how abuse can be mitigated;
identifiers which a user cannot easily change
are very valuable from a tracking perspective,
and protecting such identifiers
from an [=active network attacker=]
is vital.
</p>
<h3 class=question id="underlying-platform-data">
Do the features in your specification expose information about the
underlying platform to origins?
</h3>
(Underlying platform information includes
user configuration data,
the presence and attributes of hardware I/O devices such as sensors,
and the availability and behavior of various software features.)
If so, is the same information exposed across origins?
Do different origins see different data or the same data?
Does the data change frequently or rarely?
Rarely-changing data exposed to multiple origins
can be used to uniquely identify a user across those origins.
This may be direct
(when the piece of information is unique)
or indirect
(because the data may be combined with other data to form a fingerprint). [[FINGERPRINTING-GUIDANCE]]
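A rough way to reason about "combined with other data" is to estimate each exposed value's surprisal in bits; for roughly independent attributes the bits add up. The sketch and the population shares below are purely illustrative, not measurements:

```javascript
// Hypothetical back-of-the-envelope sketch: the identifying power of
// a piece of exposed data can be estimated as its surprisal in bits,
// -log2(p), where p is the share of users sharing the observed value.
// Roughly independent attributes add, so a few individually "harmless"
// values can combine into a near-unique fingerprint.
function surprisalBits(populationShare) {
  return -Math.log2(populationShare);
}

// e.g. a timezone shared by 5% of users, a GPU string shared by 1%,
// and a font list shared by 0.1% (illustrative numbers only):
const totalBits =
  surprisalBits(0.05) + surprisalBits(0.01) + surprisalBits(0.001);
// ~21 bits — enough to single out one user in roughly two million.
```

This is why a new feature's exposure has to be judged against the platform's existing fingerprinting surface: each additional bit halves the anonymity set.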
When considering whether or not to expose such information,
specs and user agents
should not consider the information in isolation,
but should evaluate the risk of adding it
to the existing fingerprinting surface of the platform.
Keep in mind that
the fingerprinting risk of a particular piece of information
may vary between platforms.
The fingerprinting risk of some data
on the hardware and software platforms *you* use
may be different than
the fingerprinting risk on other platforms.
When you do decide to expose such information,
you should take steps to mitigate the harm of such exposure.
Sometimes the right answer is to not expose the data in the first place (see [[#drop-feature]]).
In other cases,
reducing fingerprintability may be as simple as
ensuring consistency—for instance,
by ordering a list of available resources—but sometimes,
more complex mitigations may be necessary.
See [[#mitigations]] for more.
If features in your spec expose such data
and your spec does not define adequate mitigations,
you should ensure that such information
is not revealed to origins
without [[DESIGN-PRINCIPLES#consent|meaningful user consent]],
and
you should clearly describe this
in your specification's Security and Privacy Considerations sections.
<p class=example>
WebGL's `RENDERER` string
enables some applications to improve performance.
It's also valuable fingerprinting data.
This privacy risk must be carefully weighed
when considering exposing such data to origins.
</p>
<p class=example>
The [=PDF viewer plugin objects=] list almost never changes.
Some user agents have [disabled direct enumeration of the plugin list](https://bugzilla.mozilla.org/show_bug.cgi?id=757726)
to reduce the fingerprinting harm of this interface.
</p>
See also:
* [[DESIGN-PRINCIPLES#device-ids|Use care when exposing identifying information about devices]]
* [[DESIGN-PRINCIPLES#device-enumeration|Use care when exposing APIs for selecting or enumerating devices]]
<h3 class=question id=send-to-platform>
Does this specification allow an origin to send data to the underlying
platform?
</h3>
If so, what kind of data can be sent?
Platforms differ in how they process data passed into them,
which may present different risks to users.
Don't assume the underlying platform will safely handle the data that is passed.
Where possible, mitigate attacks by limiting or structuring the kind of data that is passed to the platform.
<div class=example>
URLs may or may not be dereferenced by a platform API,
and if they are dereferenced,
redirects may or may not be followed.
If your specification sends URLs to underlying platform APIs,
the potential harm of *your* API
may vary depending on
the behavior of the various underlying platform APIs it's built upon.
What happens when `file:`, `data:`, or `blob:` URLs
are passed to the underlying platform API?
These can potentially read sensitive data
directly from the user's hard disk or from memory.
Even if your API only allows `http:` and `https:` URLs,
such URLs may be vulnerable to [=CSRF=] attacks,
or be redirected to `file:`, `data:`, or `blob:` URLs.
</div>
<h3 class=question id="sensor-data">
Do features in this specification enable access to device sensors?
</h3>
If so, what kinds of information from or about the sensors are exposed to origins?
Information from sensors may serve as a fingerprinting vector across origins.
Additionally,
sensors may reveal something sensitive about the device or its environment.
If sensor data is relatively stable
and consistent across origins,
it could be used as a cross-origin identifier.
If two User Agents expose such stable data from the same sensors,
the data could even be used as a cross-browser, or potentially even a cross-device, identifier.
<p class=example>
Researchers discovered that
it's possible to use
a sufficiently fine-grained gyroscope
as a microphone [[GYROSPEECHRECOGNITION]].
This can be mitigated by lowering the gyroscope's sample rates.
</p>
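The sample-rate mitigation in the example above can be sketched as a simple rate limiter in the delivery path. This is an invented illustration of the idea, not how any particular user agent implements it:

```javascript
// Hypothetical sketch: cap the rate at which sensor readings are
// delivered to script. Readings that arrive faster than maxHz are
// dropped, so speech-frequency content (hundreds of Hz) never
// reaches the page.
function makeRateLimiter(maxHz) {
  const minIntervalMs = 1000 / maxHz;
  let lastDelivered = -Infinity;
  return function deliver(timestampMs, reading, callback) {
    if (timestampMs - lastDelivered >= minIntervalMs) {
      lastDelivered = timestampMs;
      callback(reading);
    }
  };
}
```

Choosing the cap is the interesting design question: it must be low enough to defeat the attack while still serving the use cases (e.g. orientation detection) the sensor exists for.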
<p class=example>
Ambient light sensors could allow an attacker to learn whether or not a
user had visited given links [[OLEJNIK-ALS]].
</p>
<p class=example>
Even relatively short-lived data, like the battery status, may be able to
serve as an identifier [[OLEJNIK-BATTERY]].
</p>
<h3 class=question id="string-to-script">
Do features in this specification enable new script execution/loading
mechanisms?
</h3>
New mechanisms for executing or loading scripts have a risk of enabling novel attack surfaces.
Generally, if a new feature needs this you should consult with a wider audience,
and think about whether or not an existing mechanism can be used
or the feature is really necessary.
<!-- NOTE: This is still experimental technology, and we might have to provide a different example if this experiment fails. -->
<p class=example>
<a href="https://github.com/whatwg/html/issues/4315">JSON modules</a> are expected to be treated only as data,
but the initial proposal allowed an adversary to swap them out with code without the user knowing.
<a href="https://github.com/tc39/proposal-import-assertions">Import assertions</a> were implemented
as a mitigation for this vulnerability.
</p>
<h3 class=question id="remote-device">
Do features in this specification allow an origin to access other devices?
</h3>
If so, what devices do the features in this specification allow an origin to
access?
Accessing other devices, both via network connections and via
direct connection to the user's machine (e.g. via Bluetooth,
NFC, or USB), could expose vulnerabilities: some of
these devices were not created with web connectivity in mind, and may be inadequately
hardened against malicious input or unprepared for use on the web.
Exposing other devices on a user’s local network also has significant privacy
risk:
* If two user agents have the same devices on their local network, an
attacker may infer that the two user agents are running on the same host
or are being used by two separate users who are in the same physical
location.
* Enumerating the devices on a user’s local network provides significant
entropy that an attacker may use to fingerprint the user agent.
* If features in this spec expose persistent or long-lived identifiers of
local network devices, that provides attackers with a way to track a user
over time even if the user takes steps to prevent such tracking (e.g.
clearing cookies and other stateful tracking mechanisms).
* Direct connections might also be used to bypass security checks that
other APIs would provide. For example, attackers used the WebUSB API to
access other sites' credentials on a hardware security key, bypassing
same-origin checks in an early U2F API. [[YUBIKEY-ATTACK]]
<p class=example>
The Network Service Discovery API [[DISCOVERY-API]] recommended CORS
preflights before granting access to a device, and required user agents to
involve the user with a permission request of some kind.
</p>
<p class=example>
Likewise, the Web Bluetooth specification [[WEB-BLUETOOTH]] has an extensive discussion of
such issues in [[WEB-BLUETOOTH#privacy]], which is worth
reading as an example for similar work.
</p>
<p class=example>
[[WEBUSB]] addresses these risks through a combination of user mediation /
prompting, secure origins, and feature policy.
See [[WEBUSB#security-and-privacy]] for more.
</p>
<h3 class=question id="native-ui">
Do features in this specification allow an origin some measure of control over
a user agent's native UI?
</h3>
Features that allow for control over a user agent’s UI (e.g. full screen
mode) or changes to the underlying system (e.g. installing an ‘app’ on a
smartphone home screen) may surprise users or obscure security / privacy
controls. To the extent that your feature does allow for the changing of a
user agent’s UI, can it affect security / privacy controls? What analysis
confirmed this conclusion?
<h3 class=question id="temporary-id">
What temporary identifiers do the features in this specification create or
expose to the web?
</h3>
If a standard exposes a temporary identifier to the web, the identifier
should be short-lived and should rotate on some regular duration to mitigate
the risk of this identifier being used to track a user over time. When a
user clears state in their user agent, these temporary identifiers should be
cleared to prevent re-correlation of state using a temporary identifier.
If features in this spec create or expose temporary identifiers to the
web, how are they exposed, when, to what entities, and, how frequently are
those temporary identifiers rotated?
Example temporary identifiers include TLS Channel ID, Session Tickets, and
IPv6 addresses.
<p class=example>
The index attribute in the Gamepad API [[GAMEPAD]] — an integer that starts
at zero, increments, and is reset — is a good example of a privacy-friendly
temporary identifier.
</p>
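The property that makes the Gamepad index privacy-friendly — smallest unused integer, reclaimed on disconnect — can be sketched as follows. This is an illustrative model of the allocation pattern, not text from the Gamepad specification:

```javascript
// Hypothetical sketch of Gamepad-style index assignment: each device
// gets the smallest unused non-negative integer, and indices are
// reclaimed on disconnect. The values therefore carry no entropy
// beyond "how many devices are connected right now", and cannot
// identify a particular device or user across sessions.
class IndexAllocator {
  constructor() {
    this.used = new Set();
  }
  connect() {
    let i = 0;
    while (this.used.has(i)) i++; // smallest unused index
    this.used.add(i);
    return i;
  }
  disconnect(i) {
    this.used.delete(i);
  }
}
```

Contrast this with a persistent serial number or a random UUID, either of which would let a site re-identify the device (and likely the user) across visits.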
<h3 class=question id="first-third-party">
How does this specification distinguish between behavior in first-party and
third-party contexts?
</h3>
The behavior of a feature should be considered not just in the context of its
use by a first-party origin that a user is visiting, but also in terms of the
implications of its use by an arbitrary third party that the first
party includes. When developing your specification, consider the implications
of its use by third-party resources on a page, and consider whether support for
use by third-party resources should be optional to conform to the
specification. If supporting use by third-party resources is mandatory for
conformance, please explain why and what privacy mitigations are in place.
This is particularly important as user agents may take steps to reduce the
availability or functionality of certain features to third parties if the
third parties are found to be abusing the functionality.
<h3 class=question id="private-browsing">
How do the features in this specification work in the context of a browser’s
Private Browsing or Incognito mode?
</h3>
Most browsers implement a private browsing or incognito mode,
though they vary significantly in what functionality they provide and
how that protection is described to users [[WU-PRIVATE-BROWSING]].
One commonality is that they provide a different set of state
than the browser's 'normal' state.
Do features in this spec provide information that would allow for the
correlation of a single user's activity across normal and private
browsing / incognito modes? Do features in the spec result in
information being written to a user’s host that would persist
following a private browsing / incognito mode session ending?
There has been research into both:
* Detecting whether a user agent is in private browsing mode [[RIVERA]]
using non-standardized methods such as <code>[window.requestFileSystem()](https://developer.mozilla.org/en-US/docs/Web/API/Window/requestFileSystem)</code>.
* Using features to fingerprint a browser and correlate private and
non-private mode sessions for a given user. [[OLEJNIK-PAYMENTS]]
Spec authors should avoid, as much as possible, making the presence of
private browsing mode detectable to sites. [[DESIGN-PRINCIPLES#do-not-expose-use-of-private-browsing-mode]]
<h3 class=question id="considerations">
Does this specification have both "Security Considerations" and "Privacy
Considerations" sections?
</h3>
Specifications should have both "Security Considerations" and "Privacy
Considerations" sections to help implementers and web developers
understand the risks that a feature presents and to ensure that
adequate mitigations are in place. While your answers to the
questions in this document will inform your writing of those sections,
do not merely copy this questionnaire into those sections. Instead,
craft language specific to your specification that will be helpful to
implementers and web developers.
[[RFC6973]] is an excellent resource to consult when considering
privacy impacts of your specification, particularly Section 7 of
RFC6973. [[RFC3552]] provides general advice as to writing Security
Consideration sections, and Section 5 of RFC3552 has specific requirements.
Generally, these sections should contain clear descriptions of the
privacy and security risks for the features your spec introduces. It is also
appropriate to document risks that are mitigated elsewhere in the
specification and to call out details that, if implemented
other-than-according-to-spec, are likely to lead to vulnerabilities.
If it seems like none of the features in your specification have security or
privacy impacts, say so in-line, e.g.:
> There are no known security impacts of the features in this specification.
Be aware, though, that most specifications include features that have at least some
impact on the fingerprinting surface of the browser. If you believe
your specification is an outlier, justifying that claim is in
order.
<h3 class=question id="relaxed-sop">
Do features in your specification enable origins to downgrade default
security protections?
</h3>
Do features in your spec
enable an origin to opt-out of security settings
in order to accomplish something?
If so,
in what situations do these features allow such downgrading, and why?
Can this be avoided in the first place?
If not, are mitigations in place
to make sure this downgrading doesn’t dramatically increase risk to users?
For instance,
[[PERMISSIONS-POLICY]] defines a mechanism
that can be used by sites to prevent untrusted <{iframe}>s from using such a feature.
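As an illustrative, non-normative sketch of that mechanism, an embedding page can deny a [=policy-controlled feature=] such as geolocation to an untrusted frame (the `third-party.example` origin here is hypothetical):

```html
<!-- Deny geolocation to the embedded document. The equivalent
     response-header form, applying to the whole page and its
     descendants, is:  Permissions-Policy: geolocation=() -->
<iframe src="https://third-party.example" allow="geolocation 'none'"></iframe>
```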
<div class=example>
The {{Document/domain|document.domain}} setter can be used to relax the [=same-origin policy=].
The most effective mitigation
would be to remove it from the platform (see [[#drop-feature]]),
though that
[may be challenging](https://github.com/mikewest/deprecating-document-domain/)
for compatibility reasons.
</div>
<div class=example>
The Fullscreen API enables
a (portion of a) web page
to expand to fill the display. [[FULLSCREEN]]
This can hide
several User Agent user interface elements
which help users to understand
what web page they are visiting
and whether or not the User Agent believes they are [safe](https://w3ctag.github.io/design-principles/#safe-to-browse).
Several mitigations are defined in the specification
and are widely deployed in implementations.
For instance, the Fullscreen API is a [=policy-controlled feature=],
which enables sites to disable the API in <{iframe}>s.
[[FULLSCREEN#security-and-privacy-considerations]] encourages implementations
to display an overlay which informs the user that they have entered fullscreen,
and to advertise a simple mechanism to exit fullscreen (typically the `Esc` key).
</div>
<h3 class=question id="bfcache">
What happens when a document that uses your feature is kept alive in BFCache
(instead of getting destroyed) after navigation, and potentially gets reused
on future navigations back to the document?
</h3>
After a user navigates away from a document,
the document might stay around in a non-"[=Document/fully active=]" state,
kept in the back/forward cache (BFCache),
and might be reused when the user navigates back to the document.
From the user’s perspective,
the non-[=Document/fully active=] document is already discarded
and thus should not get updates/events that happen after they navigated away from it,
especially privacy-sensitive information (e.g. geolocation).
Also, because a document might be reused after navigation,
be aware that tying something to a document's lifetime
means it will be reused across navigations as well.
If this is not desirable,
consider listening to changes to the [=Document/fully active=] state
and doing cleanup as necessary.
For more detailed guidance on how to handle BFCached documents,
see [[DESIGN-PRINCIPLES#non-fully-active]] and the [Supporting BFCached Documents](https://w3ctag.github.io/bfcache-guide/) guide.
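As an illustrative sketch of this cleanup pattern (none of these names come from a specification; the "resource" stands in for anything privacy-sensitive, such as a geolocation watch), a page can release resources when the document stops being fully active and re-acquire them only when it is restored from BFCache:

```javascript
// Illustrative only: release a per-document resource when the document
// stops being fully active, and re-acquire it only on BFCache restore.
let resource = null;

function acquireResource() {
  resource = { active: true }; // e.g. start a geolocation watch
}

function releaseResource() {
  if (resource) {
    resource.active = false; // e.g. clear the geolocation watch
    resource = null;
  }
}

// In a browser these would be registered as:
//   window.addEventListener("pagehide", onPageHide);
//   window.addEventListener("pageshow", onPageShow);
function onPageHide(event) {
  // Fires when the user navigates away, whether or not the document
  // will be BFCached; stop delivering sensitive updates either way.
  releaseResource();
}

function onPageShow(event) {
  // event.persisted is true only when the document was restored from
  // BFCache and is fully active again.
  if (event.persisted) acquireResource();
}
```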
Note: It is possible for a document to become non-[=Document/fully active=] for reasons unrelated to BFCaching,
such as when the iframe holding the document [=becomes disconnected=].
Our advice is that all non-[=Document/fully active=] documents should be treated the same way.
The only difference is that BFCached documents might become [=Document/fully active=] again,
whereas documents in detached iframes will stay inactive forever.
Thus, we suggest paying extra attention to the BFCache case.
<div class=example>
The Screen Wake Lock API [releases the wake lock](https://w3c.github.io/screen-wake-lock/#handling-document-loss-of-full-activity)
when a document is no longer fully active.
</div>
<div class=example>
[=Sticky activation=] is determined by the "last activation timestamp",
which is tied to a document.
This means that after a user triggers activation once on a document,
the document will have sticky activation forever,
even after the user navigates away and back to it again.
</div>
<h3 class=question id="non-fully-active">
What happens when a document that uses your feature gets disconnected?
</h3>
If the iframe element containing a document [=becomes disconnected=],
the document will no longer be [=Document/fully active=].
The document will never become fully active again,
because if the iframe element [=becomes connected=] again, it will load a new document.
The document is gone from the user's perspective,
and should be treated as such by your feature as well.
You may follow the guidelines for <a href="#bfcache">BFCache</a> mentioned above,
as we expect BFCached and detached documents to be treated the same way,
with the only difference being that BFCached documents can become [=Document/fully active=] again.
<h3 id="error-handling">
Does your spec define when and how new kinds of errors should be raised?
</h3>
Error handling,
and what conditions constitute error states,
can be the source of unintended information leaks and privacy vulnerabilities.
Triggering an error,
what information is included with (or learnable by) the error,
and which parties in an application can learn about the error can all
affect (or weaken) user privacy.
Proposal authors should carefully think
through each of these dimensions to ensure that user privacy and security are
not harmed through error handling.
A partial list of ways that error definitions and error handling can put
users at risk includes:
- If your spec defines an error state based on whether certain system resources
are available,
applications can use that error state as a probe to learn
about the availability of those system resources.
This can harm user privacy
when user agents do not intend for applications to learn about those system
resources.
- Specs often include information with error objects that are intended to help
authors identify and debug issues in applications.
Spec authors should
carefully think through what information such debugging information exposes,
and whether (and which) actors on a page are able to access that information.
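To make the first risk concrete: `getUserMedia()` rejects with a `NotFoundError` when no matching camera or microphone exists, and with a `NotAllowedError` when the user declines, which gives a page two distinguishable signals from a single error channel. A minimal sketch (the function itself is hypothetical; only the `DOMException` names mirror real APIs):

```javascript
// Illustrative sketch: distinct error names let a page infer different
// facts about the user's system from one failed call.
function whatTheErrorReveals(error) {
  if (error.name === "NotFoundError") {
    // The requested device does not exist: a hardware probe succeeded.
    return "no such device on this system";
  }
  if (error.name === "NotAllowedError") {
    // The device exists, but the user (or a policy) declined access.
    return "device likely present, access denied";
  }
  return "nothing specific";
}
```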
<h3 class=question id="accessibility-devices">
Does your feature allow sites to learn about the user's use of assistive technology?
</h3>
The Web is designed to work for everyone, and Web standards should be designed
for people using assistive technology (<abbr title="assistive technology">AT</abbr>) just as much as for users relying
on mice, keyboards, and touch screens. Accessibility and universal access
are core to the W3C's mission.
Specification authors should keep in mind, though, that Web users who rely on
assistive technology face some unique risks when using the Web.
The use of assistive technologies may cause those Web users to stand
out among other Web users, increasing the risk of unwanted reidentification
and privacy harm. Similarly, some Web site operators may try to
discriminate against Web users who rely on assistive technology.
Feature designers and <abbr title=specification>spec</abbr> authors should therefore be thoughtful and
careful to limit if, and what, websites can learn about the use of assistive
technologies. <abbr>Spec</abbr> authors must minimize what information about
assistive technology use their features reveal, both explicitly
and implicitly. Examples of <em>explicit</em> information about assistive technology
include device identifiers or model names. Examples of <em>implicit</em>
information about the use of assistive technology might include
user interaction patterns that are unlikely to be generated by a
mouse, keyboard, or touch screen.
<p class=example>
[[wai-aria-1.3]] defines additional markup that authors can use to make
their pages easier to navigate with assistive technology. The <abbr>spec</abbr>
includes the [`aria-hidden`](https://w3c.github.io/aria/#aria-hidden)
attribute, which site authors can use to indicate that certain content
should be hidden from assistive technology.
A malicious site author might
abuse the `aria-hidden` attribute to learn if a user is using assistive
technology, possibly by revealing certain page content to assistive technology,
while showing very different page content to other users. A malicious
site author could then possibly infer from the user's behavior which
content the user was interacting with, and so whether assistive technology
was being used.
</p>
<h3 class=question id="missing-questions">
What should this questionnaire have asked?
</h3>
This questionnaire is not exhaustive.
After completing a privacy review,
it may be that
there are privacy aspects of your specification
that a strict reading of, and response to, this questionnaire
would not have revealed.
If this is the case,
please convey those privacy concerns,
and indicate if you can think of improved or new questions
that would have covered this aspect.
Please consider [filing an issue](https://github.com/w3ctag/security-questionnaire/issues/new)
to let us know what the questionnaire should have asked.
<h2 id="threats">Threat Models</h2>
To consider security and privacy it is convenient to think in terms of threat
models, a way to illuminate the possible risks.
There are some concrete privacy concerns that should be considered when
developing a feature for the web platform [[RFC6973]]:
* Surveillance: Surveillance is the observation or monitoring of an
individual's communications or activities.
* Stored Data Compromise: Stored data compromise occurs when end systems
fail to adequately secure stored data from unauthorized or inappropriate
access.
* Intrusion: Intrusion consists of invasive acts that disturb or interrupt
one's life or activities.
* Misattribution: Misattribution occurs when data or communications related
to one individual are attributed to another.
* Correlation: Correlation is the combination of various pieces of
information related to an individual or that obtain that characteristic
when combined.
* Identification: Identification is the linking of information to a
particular individual to infer an individual's identity or to allow the
inference of an individual's identity.
* Secondary Use: Secondary use is the use of collected information about an
individual without the individual's consent for a purpose different from
that for which the information was collected.
* Disclosure: Disclosure is the revelation of information about an
individual that affects the way others judge the individual.
* Exclusion: Exclusion is the failure to allow individuals to know about
the data that others have about them and to participate in its handling
and use.
In the mitigations section, this document outlines a number of techniques
that can be applied to mitigate these risks.
Enumerated below are some broad classes of threats that should be
considered when developing a web feature.
<h3 id="passive-network">
Passive Network Attackers
</h3>
A <dfn>passive network attacker</dfn> has read-access to the bits going over
the wire between users and the servers they're communicating with. She can't
*modify* the bytes, but she can collect and analyze them.
Due to the decentralized nature of the internet, and the general level of
interest in user activity, it's reasonable to assume that practically every
unencrypted bit that's bouncing around the network of proxies, routers, and
servers you're using right now is being read by someone. It's equally likely
that some of these attackers are doing their best to understand the encrypted
bits as well, including storing encrypted communications for later
cryptanalysis (though that requires significantly more effort).
* The IETF's "Pervasive Monitoring Is an Attack" document [[RFC7258]] is
useful reading, outlining some of the impacts on privacy that this
assumption entails.
* Governments aren't the only concern; your local coffee shop is likely to
be gathering information on its customers, and your ISP at home is likely
to be doing the same.
<h3 id="active-network">
Active Network Attackers
</h3>
An <dfn>active network attacker</dfn> has both read- and write-access to the