Data Security¶
In the era of cloud computing, data security is crucial. Comprehensive data protection capabilities improve visibility and insight and provide automatic warnings of security risks, which strengthens overall defenses and helps ensure data availability and security compliance.
When using TrueWatch, its built-in tools will assess and process the received data for risks.
How to Reduce Data Risks?¶
TrueWatch collects monitoring information from your infrastructure and services and manages it centrally, so you can analyze and process it at any time. During this process, servers using TrueWatch send various types of data. Most of the data collected through normal use of TrueWatch products does not contain personal privacy information. For non-essential personal data that may be included, we provide detailed explanations and recommendations to prevent confusion. TrueWatch offers multiple ways to help you reduce data risks.
Data Security Considerations on the DataKit Side¶
HTTPS Data Upload¶
All DataKit data is uploaded using the HTTPS protocol to ensure the security of data communication.
Limited Distribution Mechanism¶
The center cannot issue commands to DataKit for execution; all requests are actively initiated by DataKit. DataKit can only periodically pull some related configurations (such as Pipeline and blacklist configurations) from the center.
Field Value Desensitization During Tracing Collection¶
During Tracing collection, the execution of some SQL statements may be captured. The field values in these SQL statements are desensitized.
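As a hedged illustration of this kind of value desensitization (not DataKit's actual implementation), the sketch below replaces string and numeric literals in a SQL statement with ? placeholders. The function name obfuscateSql, the regular expressions, and the sample statement are all assumptions made for this example.

```javascript
// Hypothetical helper: replace literal values in a SQL statement with "?".
// This is an illustration only, not how DataKit implements desensitization.
function obfuscateSql(sql) {
  return sql
    // Single-quoted string literals, including doubled '' escapes.
    .replace(/'(?:[^']|'')*'/g, '?')
    // Standalone numeric literals (digits inside identifiers such as col1 are kept).
    .replace(/\b\d+(?:\.\d+)?\b/g, '?')
}

// Sample statement invented for this sketch.
console.log(obfuscateSql("SELECT name FROM class WHERE name = 'zhangsan' AND age > 18"))
```

The statement structure stays readable for troubleshooting while the concrete values are removed.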
Pipeline and Blacklist Mechanism¶
If some sensitive data cannot be removed during collection, specific Pipeline functions (such as the cover() function, which replaces part of a string with *) can be used to desensitize it (for example, phone numbers).
In addition, by configuring blacklist rules, the upload of some sensitive data can also be prevented.
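The effect of this kind of masking can be sketched in JavaScript. This is not Pipeline syntax: coverRange is a hypothetical helper, and its 1-based inclusive range is an assumption chosen for the example.

```javascript
// Hypothetical helper: mask the characters from position `start` to `end`
// (1-based, inclusive) with "*". Illustrative only, not the Pipeline function.
function coverRange(value, start, end) {
  return Array.from(value)
    .map((ch, i) => (i + 1 >= start && i + 1 <= end ? '*' : ch))
    .join('')
}

// Mask the middle four digits of a made-up phone number.
console.log(coverRange('13523456789', 4, 7)) // prints "135****6789"
```

Masking only part of the value keeps enough of it to correlate records without exposing the full number.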
Sensitive Data Scanning¶
The sensitive data scanning feature identifies, marks, and edits data that contains personal privacy information or other high-risk content. As a line of defense, it effectively prevents sensitive data from leaking.
For more details, please refer to Sensitive Data Scanning.
Logs¶
Using TrueWatch product services generates many log records. Because log data is strongly correlated, specific rules need to be applied during collection and analysis to filter the large volume of log data.
By configuring sensitive fields for log data, members with corresponding permissions can only see desensitized log data.
Data access control is another key method for reducing log data security risks. Configuring the log data query range for each role isolates data between roles and provides comprehensive management and filtering of sensitive data.
For more details, please refer to Multi-Role Data Access Control.
Snapshots¶
A TrueWatch snapshot is an instant copy of data that contains the filter conditions and data records for abnormal data. When you need to share monitoring data, you can set data desensitization rules or choose the sharing method when sharing a snapshot. This generates an access link with the specified viewing permissions, which protects the shared data.
For more details, please refer to Snapshots.
RUM¶
When collecting data about user access, the RUM (Real User Monitoring) SDK can intercept and modify the data to keep sensitive data from being sent.
For more details, please refer to SDK Data Interception and Data Modification.
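As a rough sketch of the kind of modification an interception step might apply, the function below redacts sensitive query parameters from a URL before it is reported. The function name redactQueryParams, the parameter names, and the placeholder value are assumptions for illustration; how the function is wired into the SDK is covered by the linked page and is not shown here.

```javascript
// Hypothetical helper: overwrite the values of sensitive query parameters.
// Uses the standard URL and URLSearchParams APIs.
function redactQueryParams(rawUrl, sensitiveNames) {
  const url = new URL(rawUrl)
  for (const name of sensitiveNames) {
    if (url.searchParams.has(name)) {
      url.searchParams.set(name, 'REDACTED')
    }
  }
  return url.toString()
}

// Example URL invented for this sketch.
console.log(redactQueryParams('https://example.com/pay?token=abc123&page=2', ['token']))
```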
Session Replay Privacy Settings¶
Session Replay provides privacy controls so that organizations do not expose sensitive or personal data, and the recorded data is stored encrypted. The default privacy options for Session Replay are designed to protect end-user privacy and prevent sensitive organizational information from being collected.
Global Configuration¶
By enabling Session Replay, sensitive elements can be automatically masked, preventing them from being recorded by the RUM SDK.
To enable your privacy settings, set defaultPrivacyLevel to mask-user-input, mask, or allow in your SDK configuration.
import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  // Choose one of: 'mask-user-input', 'mask', 'allow'
  defaultPrivacyLevel: 'mask-user-input',
})

// Start recording Session Replay data
datafluxRum.startSessionReplayRecording()
After updating the configuration, you can use the following privacy options to override the elements of the HTML document:
Mask user input mode: Masks most form fields, such as inputs, text areas, and checkbox values, while recording all other text as is. Inputs are replaced with three asterisks (***), and text areas are obfuscated with x characters that preserve space.
Note
By default, mask-user-input is the privacy setting when Session Replay is enabled.
Mask mode: Masks all HTML text, user input, images, and links. The text on the application is replaced with X, rendering the page as a wireframe.
Allow mode: Records all data.
Some limitations:
For data security, regardless of the defaultPrivacyLevel mode you configure, the following elements are always masked:
- Input elements of type password, email, and tel;
- Elements with the autocomplete attribute, such as credit card numbers, expiration dates, and security codes.
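The three modes and the forced masking of password, email, and tel inputs can be modeled with a small sketch. This is a simplified model written for illustration, not the SDK's implementation: the recordedValue function and the shape of the field object are assumptions, and the autocomplete rule is left out for brevity.

```javascript
// Hypothetical model of the privacy modes described above; not the SDK's code.
const ALWAYS_MASKED_INPUT_TYPES = new Set(['password', 'email', 'tel'])

// Replace every non-whitespace character so that spacing is preserved.
function obfuscate(text, maskChar) {
  return text.replace(/\S/g, maskChar)
}

// field: { kind: 'input' | 'textarea' | 'text', inputType?: string, value: string }
function recordedValue(field, privacyLevel) {
  if (field.kind === 'input') {
    // Some input types are masked even in 'allow' mode.
    const forced = ALWAYS_MASKED_INPUT_TYPES.has(field.inputType)
    return forced || privacyLevel !== 'allow' ? '***' : field.value
  }
  if (field.kind === 'textarea') {
    return privacyLevel === 'allow' ? field.value : obfuscate(field.value, 'x')
  }
  // Plain page text is only replaced in 'mask' mode.
  return privacyLevel === 'mask' ? obfuscate(field.value, 'X') : field.value
}

console.log(recordedValue({ kind: 'input', inputType: 'password', value: 's3cret' }, 'allow')) // prints "***"
console.log(recordedValue({ kind: 'textarea', value: 'hi there' }, 'mask-user-input')) // prints "xx xxxxx"
```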
Custom Configuration¶
Session Replay supports the masking of sensitive elements, and you can flexibly set the content to be masked according to business needs, such as phone numbers and other sensitive information. The following are specific methods:
Configuring Masking Through Element Attributes¶
You can add the data-gc-privacy attribute to elements that need to be masked, supporting the following four attribute values:
• allow: Allows data collection, no masking.
• mask: Masks content, displaying content in a masked form.
• mask-user-input: Masks user input, preventing the recording of sensitive input data.
• hidden: Completely hides content.
Example code:
<!-- Allows data collection -->
<div class="mobile" data-gc-privacy="allow">13523xxxxx</div>
<!-- Masks content -->
<div class="mobile" data-gc-privacy="mask">13523xxxxx</div>
<!-- Masks user input -->
<input class="mobile" data-gc-privacy="mask-user-input" value="13523xxxxx" />
<!-- Hides content -->
<div class="mobile" data-gc-privacy="hidden">13523xxxxx</div>
Configuring Masking Through Element Class Names¶
Masking can also be applied by adding specific class names to elements. Currently, the following class names are supported:
• gc-privacy-allow: Allows data collection.
• gc-privacy-mask: Masks content.
• gc-privacy-mask-user-input: Masks user input.
• gc-privacy-hidden: Completely hides content.
Example code:
<!-- Allows data collection -->
<div class="mobile gc-privacy-allow">13523xxxxx</div>
<!-- Masks content -->
<div class="mobile gc-privacy-mask">13523xxxxx</div>
<!-- Masks user input -->
<input class="mobile gc-privacy-mask-user-input" value="13523xxxxx" />
<!-- Hides content -->
<div class="mobile gc-privacy-hidden">13523xxxxx</div>
Using shouldMaskNode to Implement Custom Node Masking Strategies¶
In some special scenarios, it may be necessary to customize the masking of specific DOM nodes. For example, in applications with high security levels, it may be desirable to uniformly mask all text content containing numerical values on the page. This requirement can be met by configuring the shouldMaskNode callback function, which allows more flexible privacy control strategies.
import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  // Choose one of: 'mask-user-input', 'mask', 'allow'
  defaultPrivacyLevel: 'mask-user-input',
  shouldMaskNode: (node, privacyLevel) => {
    if (node.nodeType === Node.TEXT_NODE) {
      // For a text node, mask it when the content contains digits
      const textContent = node.textContent || ''
      return /\d+/.test(textContent)
    }
    // All other node types are not masked by this callback
    return false
  },
})

// Start recording Session Replay data
datafluxRum.startSessionReplayRecording()
In the above example, the shouldMaskNode function inspects every text node. If the content contains digits (such as amounts or phone numbers), the node is masked, which strengthens the privacy protection of user data.
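One way to gain confidence in such a callback is to pull the predicate out into a standalone function and check it against plain objects outside the browser. The sketch below does that. The name textNodeContainsDigits and the mock node objects are assumptions for illustration; the constant 3 is the standard DOM value of Node.TEXT_NODE.

```javascript
// Standard DOM value of Node.TEXT_NODE, defined here so the check also runs
// in environments without a global Node object.
const TEXT_NODE = 3

// Hypothetical standalone version of the predicate used in shouldMaskNode.
function textNodeContainsDigits(node) {
  if (node.nodeType !== TEXT_NODE) {
    return false
  }
  return /\d/.test(node.textContent || '')
}

// Mock nodes invented for this sketch.
console.log(textNodeContainsDigits({ nodeType: 3, textContent: 'Balance: 1200' })) // prints true
console.log(textNodeContainsDigits({ nodeType: 3, textContent: 'Hello' })) // prints false
```

The same function could then be passed as the shouldMaskNode option, since it ignores the second argument.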
Some Recommendations
- Priority Rules:
• If both data-gc-privacy attributes and class names are set, it is recommended to determine the priority according to the project documentation.
- Use Cases:
• allow: Suitable for regular data that does not require masking.
• mask: Suitable for sensitive data that needs to be displayed in a masked form, such as phone numbers.
• mask-user-input: Suitable for scenarios where input content needs to be protected, such as password fields.
• hidden: Suitable for content that you do not want to display or record.
- Best Practices:
• Prioritize simple and clear methods (such as class names or attributes) to ensure accurate configuration.
• In high-sensitivity data scenarios, such as user privacy forms, it is recommended to use mask-user-input or hidden.
Through the above methods, you can flexibly configure the masking rules of sensitive elements, improve data security, and meet business compliance requirements.