Balancing Privacy and Utility: Query Release in Telemetry Data

Reva Agrawal·Jordan Lambino·Dhruv Patel
Yu-Xiang Wang·Bijan Arbab

GitHub·Report·Poster

What does it mean for data to be private? Some may believe that simply omitting an individual’s identifiers (name, SSN, email, etc.) provides a privacy guarantee to that user. The common misconception is that these fields are the sole identifiers of participants in a dataset. In reality, every data point in a dataset, whether related to demographics, behavior, transactions, or education, forms part of an individual’s identity.

Privacy attacks can leverage data to reconstruct the entire “profile” of an individual, exposing information ranging from their address, to their finances, to their browsing history; in short, these data attacks can reveal information which seemed to be “private.”

Our project explores the application of privacy mechanisms for telemetry data logs. Hardware and software vendors such as Intel rely on telemetry to understand how products behave in the field. However, releasing aggregate statistics can still enable re-identification when combined with auxiliary information. The tension between the need for actionable insights and the obligation to protect individuals is the problem we address.

Project Overview

The Problem: Telemetry Analytics vs. Privacy

Modern hardware and software systems collect telemetry data to analyze device usage in real world environments. The data are stored in logs which reveal useful insights for analyzing and improving product performance, reliability, and feature adoption. However, these datasets often contain detailed, device-specific information that can induce privacy risks, even when user “anonymity” is guaranteed.

This project investigates how differential privacy can enable the release of telemetry aggregates or statistics while protecting individual privacy. We implement a differentially private query-release pipeline that introduces calibrated noise to aggregate queries while clipping each device’s contribution to the results.

Telemetry Dataset

The dataset used in this study contains Intel telemetry logs collected from real-world devices. Each record is associated with a unique device identifier (GUID) and describes system events, such as battery usage, power consumption, browser activity, and hardware configuration.

The raw telemetry data contains 23 source tables, which we transform into 22 reporting tables after pre-processing via SQL build scripts. These reporting tables simplify complex SQL joins and allow the execution of 12 benchmark analytical queries used for evaluation.

In our work, we create two DuckDB databases:

  • ~5GB subsample for development
  • Full production database via SQL build script
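One practical way to carve out a development subsample like the one above is to sample by device rather than by row, so that every record for a sampled GUID stays together. The helper below is a hypothetical sketch (not the project's actual build script) showing deterministic, hash-based device sampling:

```python
import hashlib

def keep_device(guid: str, fraction: float) -> bool:
    """Deterministically decide whether a device's records enter the subsample.

    Hashing the GUID keeps all records for a device together, so per-device
    statistics in the subsample remain meaningful.
    """
    digest = hashlib.sha256(guid.encode()).digest()
    # Map the first 8 bytes of the hash to a value in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < fraction

# Hypothetical telemetry records keyed by GUID.
records = [
    {"guid": "dev-a", "event": "battery"},
    {"guid": "dev-b", "event": "browser"},
    {"guid": "dev-a", "event": "power"},
]
subsample = [r for r in records if keep_device(r["guid"], fraction=0.5)]
```

Because the decision depends only on the GUID, re-running the build produces the same subsample every time.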

Query Descriptions

As noted above, we work with 12 benchmark queries which Intel's engineering teams use for actual analysis. These queries answer analytical questions such as battery health by geography, battery health by CPU generation, common software trends, and the most popular browser by country. They provide a meaningful testbed for evaluating the effectiveness of differential privacy and for finding an optimal balance between privacy and utility.

Query Table
Query Categorization

Differential Privacy Pipeline

  • Load the raw telemetry data into DuckDB. Run the build step to produce 22 reporting tables.
  • Run the 12 benchmark queries with per-GUID clipping in SQL to obtain the non-private baseline.
  • Run the Laplace and Analytic Gaussian mechanisms at each ε ∈ {0.01, 0.05, 0.1, 0.5, 1.0, ∞} (with ∞ as the no-noise reference) for both the baseline and advanced variants.
  • Compute utility scores by median relative error, total variation distance, and Spearman rank correlation.
  • Evaluation computes per-query statistics and Laplace vs. Gaussian comparison across ε.
  • Evaluation outputs privacy-utility tradeoff curves, pass rate, and Pareto frontier metrics.
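The ε sweep at the heart of the steps above can be sketched in a few lines. This is a minimal illustration, not the project's pipeline code: `release` and `laplace_noise` are hypothetical names, and ε = ∞ is handled as the exact, no-noise reference.

```python
import math
import random

EPSILONS = [0.01, 0.05, 0.1, 0.5, 1.0, math.inf]

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def release(true_answer: float, sensitivity: float, eps: float,
            rng: random.Random) -> float:
    if math.isinf(eps):            # ε = ∞: no-noise reference point
        return true_answer
    return true_answer + laplace_noise(sensitivity / eps, rng)

rng = random.Random(0)
sweep = {eps: release(1000.0, sensitivity=1.0, eps=eps, rng=rng)
         for eps in EPSILONS}
```

Smaller ε means a larger noise scale (sensitivity/ε), so answers at ε = 0.01 are far noisier than at ε = 1.0.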

The flowchart below illustrates the complete pipeline for this study:

Results

Experiments were run on mini (subsample, ~5GB) and full databases for both the baseline and advanced variants.

Baseline vs. Advanced Pass Rates

Across all ε values, the Laplace mechanism consistently achieves equal or higher pass rates than the Gaussian mechanism, indicating better overall utility under the same privacy budgets. Compared to the baseline implementation, the advanced counterpart reaches higher pass rates earlier, meaning that more queries satisfy the accuracy thresholds at lower privacy budgets.

Baseline pass rate results
Pass rate results for the baseline variant on the mini dataset.
Advanced pass rate results
Pass rate results for the advanced variant on the mini dataset.

At ε=0.01 few queries pass the threshold, while at ε=∞ around 80% of queries pass on mini, with the remainder failing due to structural properties of the query or small group sizes. The advanced variant achieves a similar or better pass rate at ε=0.5 than the baseline does at ε=1.0.

Privacy-Utility Tradeoff Curves

As ε increases, noise decreases and query accuracy improves across all three metrics.

  • Median relative error drops
  • Total variation distance shrinks
  • Spearman rank correlation rises

The advanced variant (bottom) reaches acceptable utility at a lower ε than the baseline (top), thus achieving the same accuracy while ensuring stronger privacy.

Baseline privacy-utility tradeoff
Privacy-utility tradeoff visualization for the baseline variant.
Advanced privacy-utility tradeoff
Privacy-utility tradeoff visualization for the advanced variant.

At low ε error is high, and as ε → ∞, utility approaches the non-private baseline. The advanced curves plateau earlier, meaning comparable utility is achieved under stronger privacy.

Mechanism Comparison

The Laplace Mechanism achieves lower error and lower distribution distortion compared to the Gaussian at each ε value tested, across both RE and TVD metrics. The gap narrows as ε increases, but the Laplace maintains an advantage throughout in both baseline and advanced implementations.

Baseline mechanism comparison
Mechanism comparison for the baseline variant.
Advanced mechanism comparison
Mechanism comparison for the advanced variant.

Optimal ε Selection

We evaluated each mechanism across the ε grid and computed a utility-preservation score relative to the non-private baseline at ε=∞. The recommended ε is selected as the smallest value for which the average utility preservation across all queries reaches at least 80%. These results are summarized below:

Best Epsilon

The table above includes negative utility values since the results are compared to the non-private baseline where ε=∞. It’s important to note that this result should not raise concerns; on the mini database, no ε in the grid reached the 80% mean preservation target, so the selection logic correctly falls back to the ε with the highest score (best available). Both chosen operating points (ε = 1.0 for the baseline, ε = 0.5 for the advanced variant) are fully differentially private; only ε = ∞ is non-private.
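The selection logic described above, including the fallback when no ε reaches the target, can be expressed compactly. This is a hedged sketch: `choose_epsilon` is a hypothetical name, and the mini-database scores below are illustrative, not the project's actual numbers.

```python
import math

def choose_epsilon(scores: dict, target: float = 0.80) -> float:
    """Pick the smallest finite ε whose mean utility preservation meets
    `target`; if none qualifies, fall back to the best-scoring finite ε.
    `scores` maps each ε in the grid to its mean utility preservation."""
    finite = {e: s for e, s in scores.items() if math.isfinite(e)}
    passing = [e for e, s in finite.items() if s >= target]
    if passing:
        return min(passing)
    return max(finite, key=finite.get)   # best-available fallback

# Illustrative scores where nothing reaches 80% (as on the mini database):
mini = {0.01: -1.2, 0.05: -0.4, 0.1: 0.1, 0.5: 0.55, 1.0: 0.71, math.inf: 1.0}
```

Note that ε = ∞ is excluded from the candidates, so the fallback always returns a differentially private operating point.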

Methods

Differential Privacy

Differential privacy provides a mathematical framework guaranteeing that the output distribution of a query changes only slightly when a single individual record is included in or excluded from the dataset.

Mathematically, differential privacy is represented by the following equation:

Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S]
  • Where M is the randomized mechanism, D and D′ are neighboring datasets differing in a single record, S is any set of outputs, and ε is the privacy parameter which controls the privacy-utility trade-off
  • Smaller values of ε result in stronger privacy at the cost of utility
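The inequality can be checked numerically for the Laplace mechanism on a counting query: with noise scale b = Δ/ε, the density ratio between neighboring outputs never exceeds e^ε. The numbers below are illustrative.

```python
import math

def laplace_pdf(x: float, mu: float, b: float) -> float:
    """Density of Laplace(mu, b) at x."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

eps, sensitivity = 1.0, 1.0            # counting query: Δ = 1
b = sensitivity / eps                  # Laplace scale calibrated to Δ/ε
mu_D, mu_Dprime = 100.0, 101.0         # answers on neighboring datasets

# At every output x, the density ratio is bounded by e^ε on both sides.
ratios = [laplace_pdf(x, mu_D, b) / laplace_pdf(x, mu_Dprime, b)
          for x in [-5.0, 0.0, 99.5, 100.5, 200.0]]
```

Since ||x − μ′| − |x − μ|| ≤ |μ − μ′| = Δ, every ratio lies in [e^{−ε}, e^{ε}], which is exactly the differential privacy guarantee.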

Contribution Bounding

To account for the disproportionate impact of some devices towards aggregates, we bound or clip each device’s contribution prior to computing statistics. For a value x:

x_clipped = min(x, C)

Where C is a predefined clipping bound. This helps limit global sensitivity across the dataset and ensures that “outlier” devices can’t singlehandedly influence results.
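In code, the clipping step is a one-liner; the example values below are hypothetical and simply show an outlier device being bounded.

```python
def clip_contribution(x: float, C: float) -> float:
    """Bound a single device's contribution so that adding or removing one
    device changes a clipped sum by at most C (the global sensitivity)."""
    return min(x, C)

# Hypothetical per-device values with one extreme outlier.
values = [3.2, 4.1, 2.8, 950.0]
C = 10.0
clipped_sum = sum(clip_contribution(v, C) for v in values)
```

Without clipping, the outlier device would dominate the sum and force a much larger noise scale; with C = 10, the sensitivity of the sum is exactly 10 regardless of what any one device reports.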

Privacy Mechanisms

In order to guarantee individual privacy, we add noise to the query outputs using two differential privacy mechanisms. A brief description of each mechanism can be seen in the figure below:

DP Mechanisms
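Both mechanisms can be sketched in a few lines. Note one assumption: the Gaussian variant below uses the classical (ε, δ) calibration σ = Δ₂·√(2 ln(1.25/δ))/ε, whereas the project's analytic Gaussian mechanism computes a tighter σ that we do not reproduce here.

```python
import math
import random

def laplace_mechanism(answer: float, sensitivity: float, eps: float,
                      rng: random.Random) -> float:
    scale = sensitivity / eps
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return answer + noise

def gaussian_mechanism(answer: float, l2_sensitivity: float, eps: float,
                       delta: float, rng: random.Random) -> float:
    # Classical (ε, δ)-DP calibration; the analytic Gaussian mechanism
    # would compute a tighter σ for the same (ε, δ).
    sigma = l2_sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / eps
    return answer + rng.gauss(0, sigma)
```

In both cases the noise scale grows as ε shrinks, which is the mechanical source of the privacy-utility trade-off measured in the results above.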

Implementation Variants

  • Baseline: Recomputes the noise scale fresh for every (query, column, ε) triple at run time.
  • Advanced: Builds a scale cache once per (sensitivity, ε) pair and looks it up during release; typically 2-4x faster with identical outputs and privacy guarantees.
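The advanced variant's scale cache amounts to memoizing a pure function of (sensitivity, ε). A minimal sketch using the standard library (the function name is hypothetical):

```python
import functools

@functools.lru_cache(maxsize=None)
def noise_scale(sensitivity: float, eps: float) -> float:
    """Laplace scale, computed once per (sensitivity, ε) pair and then
    served from the cache on every subsequent lookup."""
    return sensitivity / eps

# Repeated (query, column, ε) triples that share a (sensitivity, ε) pair
# hit the cache instead of recomputing:
scales = [noise_scale(1.0, e) for e in (0.1, 0.5, 0.1, 0.5, 0.1)]
info = noise_scale.cache_info()
```

Because the cached value is exactly what the baseline would recompute, the outputs and the privacy guarantee are unchanged; only the bookkeeping cost drops.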

Privacy Budget

When multiple queries are released, privacy loss accumulates through sequential composition. We evaluate the system across multiple privacy budgets:

ε ∈ {0.01, 0.05, 0.1, 0.5, 1.0, ∞}

In doing so, we can analyze the impact of different privacy levels on utility.
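Under basic sequential composition, the total privacy loss of a release is simply the sum of the per-query budgets. A sketch (the per-query allocation below is illustrative):

```python
def composed_epsilon(per_query_eps: list) -> float:
    """Basic sequential composition: total privacy loss is the sum of the
    per-query budgets spent on the same data."""
    return sum(per_query_eps)

# Releasing all 12 benchmark queries at ε = 1.0 each costs ε = 12.0 total.
total = composed_epsilon([1.0] * 12)
```

This additive accounting is what makes low per-query ε values attractive: the total budget scales linearly with the number of released queries.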

Evaluation Metrics

The table below includes three evaluation metrics, along with the corresponding queries for each one.

Evaluation Metrics
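The three metrics are straightforward to compute from matched true/noisy answers. The sketch below uses only the standard library; the Spearman implementation uses simple ranks without tie correction, which is sufficient for illustration but cruder than a statistics library would be.

```python
import statistics

def median_relative_error(true_vals, noisy_vals):
    """Median of |noisy - true| / |true| over matched answers."""
    return statistics.median(abs(n - t) / abs(t)
                             for t, n in zip(true_vals, noisy_vals))

def total_variation_distance(p, q):
    """Half the L1 distance between two probability distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def spearman_rho(x, y):
    """Pearson correlation of the rank vectors (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Relative error captures scalar accuracy, total variation distance captures distortion of released distributions, and Spearman ρ captures whether rankings (e.g., most popular browser by country) survive the noise.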

Discussion

Key Takeaways

  • Laplace Mechanism outperforms Gaussian Mechanism on low-dimensional scalar queries for the majority of ε values, winning on 8 out of 10 queries.
  • Gaussian Mechanism performs well for high-dimensional outputs (Q7, Q8, Q9), as its geometry features suppress noise more efficiently than the Laplace.
  • At ε=1.0 on the full database, most queries pass their utility thresholds, with Laplace achieving median RE (relative error) < 0.03 for Q1, Q2, Q9 and Spearman ρ ≥ 0.99 for Q3.
  • The advanced variant achieves comparable accuracy to baseline at ε=0.5 vs. ε=1.0, delivering similar utility results while providing a stronger privacy guarantee.

Practical Implications

For Intel telemetry analysis (and other similar work), releasing the 12 query answers at ε=1.0 per query provides a concrete deployment point. Most queries achieve pass thresholds, and the guarantee is auditable from the published sensitivity and mechanism. Queries such as Q4 (vendor percentage), Q6 (browser winner), and Q12 (ranked processes on mini) can be improved further through increased budget allocation, implementing mechanism-specific changes, or particular tuning for releases. In practice, our pipeline is reproducible (fixed seed, versioned scripts) and can be re-run on updated data or with different ε allocations.

Limitations

A clear limitation lies in the fact that our project analyzes 12 out of the 22 total Intel queries. Beyond the fixed set of queries we were given, there are still a number of possible queries which future engineers may wish to explore.

An additional limitation involves dataset scale, primarily due to resource constraints. Given that the original dataset exceeded 22TB, our group needed to narrow our scope and apply privacy frameworks to a significantly smaller subsample. With more time and compute resources, we could validate our findings on the full dataset.

A further constraint is the limited number of DP mechanisms explored in our work. Additional approaches, such as the Exponential Mechanism, could theoretically provide both high privacy and high utility for analytics.

Future Outlook

Our work can be extended to explore adaptive privacy budgets, improved query sensitivity analysis, and deployment in large production telemetry systems. Moreover, this project implements basic composition, whereas advanced composition might tighten the composed ε for the same per-query budgets.

Acknowledgements

We would like to thank Dr. Yu-Xiang Wang for his mentorship and guidance, along with industry fellow Dr. Bijan Arbab for his continued support throughout our work.

To reproduce our project or extend our work, visit our GitHub repository and follow the README.md for instructions.