Google Analytics 4 (GA4) has a feature that often goes unnoticed: Reporting Identity. Many users overlook it, unaware of how much it shapes what their reports say about user behavior. In this blog post, we'll shed light on the feature and explain why it matters when you analyze your data.
By the end of this post, you'll be more confident analyzing user behavior and making data-driven decisions, and you'll have a newfound appreciation for how Reporting Identity can change the way you read and use analytics data. So, let's dive in and unlock the knowledge behind GA4's Reporting Identity feature together.
Why Google Needs Reporting Identities in GA4
Here are some of the reasons Google needs reporting identities in GA4:
To Bridge the Device Gap
Problem: Users interact with businesses across multiple devices (phone, laptop, tablet, etc.). Traditional tracking methods like cookies often struggle to connect these fragmented interactions.
Solution: Reporting identities help GA4 stitch these fragments together, creating a unified view of the user journey across devices. This provides a more complete picture of user behavior and preferences.
To Go Beyond Cookies
Problem: Cookie tracking is becoming increasingly unreliable due to privacy concerns and browser blocking. This means many user interactions go unseen in analytics.
Solution: Reporting identities offer alternative ways to identify users, even without cookies. This ensures crucial data isn't lost and allows for more accurate analysis.
To Unlock Cross-Platform Insights
Problem: Traditional analytics platforms tend to silo data by platform (website, app, etc.), making it difficult to understand how users move across them.
Solution: Reporting identities enable cross-platform analysis. You can see all user touchpoints - from website visits to app interactions - within a single platform, providing a unified view of your user base and their behavior across different channels.
To Power Personalization and Marketing
Problem: Without a complete understanding of user journeys, it's challenging to personalize marketing efforts and content.
Solution: By understanding how users move across devices and platforms, reporting identities offer powerful insights into their preferences and behavior. This information can be used to personalize your marketing messages, deliver relevant content, and ultimately, create a more satisfying experience for your users.
To Maintain Privacy Compliance
Problem: Privacy regulations like GDPR and CCPA require businesses to handle user data responsibly.
Solution: Reporting identities can help you anonymize and obfuscate user data, ensuring you're complying with these regulations while still leveraging valuable insights for analysis.
What are GA4 Reporting Identities?
Google Analytics 4 (GA4) uses four "identity spaces" to identify users across devices and platforms, providing a more holistic view of user journeys. Here's a breakdown of each:
User ID
- This is a unique identifier you assign to logged-in users.
- It's the most accurate way to track users across devices because it remains constant regardless of platform.
- Requires user logins and integration with your user database.
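As a rough illustration of what "assigning" a User ID looks like in practice, the sketch below builds a request body in the shape the GA4 Measurement Protocol accepts, where `user_id` travels alongside the device-level `client_id`. The `build_payload` helper and all ID values are made up for this example, and nothing is actually sent to Google.

```python
import json
from typing import Optional


def build_payload(client_id: str, user_id: Optional[str], event_name: str) -> dict:
    """Build a Measurement Protocol-style body (hypothetical helper, not a Google API)."""
    payload = {
        "client_id": client_id,  # device-level identifier (placeholder value below)
        "events": [{"name": event_name, "params": {}}],
    }
    # Only logged-in users have a User ID; anonymous hits omit the field.
    if user_id is not None:
        payload["user_id"] = user_id
    return payload


logged_in = build_payload("555.1234567890", "crm-user-42", "login")
anonymous = build_payload("555.1234567890", None, "page_view")
print(json.dumps(logged_in, indent=2))
```

The same person on a phone and a laptop would have two different `client_id` values but one `user_id`, which is what lets the two devices be tied together.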
Google Signals (Removed from Reporting Identity)
- Leverages data from signed-in Google accounts across websites and apps.
- Works only if users both opt in to data sharing and are signed in to their Google account.
- Provides cross-device insights without requiring user logins on your site.
But let's stay in this identity space for a while. The Analytics community was surprised by the announcement that Google Signals would be removed from the GA4 Reporting Identity on February 12, 2024. The change applies to all Google Analytics 4 properties and only affects reporting features.
In the GA4 UI, you can also turn off the "Include Google Signals in Reporting Identity" switch yourself. This means reports won't show certain demographics and interests from people who are signed in and have given their consent. This is especially helpful for the Blended and Observed identities because it reduces the chance of hitting data limits, which Analytics calls data thresholding. Don't worry, Google will still collect this data for audiences, conversions, and sharing with your linked Google Ads account for things like remarketing and bid adjustments.
Device ID
- Unique identifier for each device accessing your site or app.
- Less accurate than User ID or Google Signals due to multiple users sharing devices.
- Can still be valuable for understanding device-specific behavior.
Modeling
- GA4 estimates user identity when other methods fail (e.g., no available User ID, Google Signals disabled, or cookie blocking).
- Uses statistical models based on various data points like IP address, browser, and behavior.
- Less precise than other methods but provides insights when direct identification is impossible.
GA4 Reporting Identity Options
GA4 offers three reporting identity options that determine how these identity spaces are used:
- Blended: Uses User ID first, then Google Signals (if enabled), then Device ID if neither User ID nor Google Signals is available, and finally Modeling when no other identifier is available.
- Observed: Prioritizes observable data in the order of User ID, Google Signals, and Device ID (excludes Modeling).
- Device-based: Relies solely on Device ID, ignoring all other identity spaces.
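The priority order described above can be sketched as a small lookup. This is only a toy model of the description in this post, not GA4's actual implementation; the `PRIORITY` table, the `resolve_identity` function, and the sample values are all hypothetical.

```python
from typing import Optional, Tuple

# Identity spaces each option consults, in priority order (per the list above).
PRIORITY = {
    "blended": ["user_id", "google_signals", "device_id", "modeling"],
    "observed": ["user_id", "google_signals", "device_id"],
    "device_based": ["device_id"],
}


def resolve_identity(option: str, available: dict) -> Optional[Tuple[str, Optional[str]]]:
    """Return (identity_space, value) for the first usable space, or None."""
    for space in PRIORITY[option]:
        if space == "modeling":
            # Last resort in Blended: no observed identifier, so estimate instead.
            return ("modeling", None)
        value = available.get(space)
        if value:
            return (space, value)
    return None


hit = {"user_id": None, "google_signals": None, "device_id": "555.123"}
print(resolve_identity("blended", hit))
print(resolve_identity("device_based", {"user_id": "u1", "device_id": "555.123"}))
print(resolve_identity("observed", {}))
```

Note how Device-based ignores the User ID even when one is present, and how Observed returns nothing when no identifier was observed, where Blended would fall back to Modeling.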
Which GA4 Reporting Identity Should You Use?
The best Reporting Identity for you depends on several factors, including:
User Logins:
- If you have user logins, you can choose either Blended or Observed. Both prioritize User ID (the most accurate method) and provide cross-device insights. Observed excludes Modeling for stricter data privacy.
- If you don't have user logins, you can consider Device-based or Blended (if Google Signals is enabled). Device-based is simpler to implement but can miscount users: a shared device is counted as one user, while one person on several devices is counted several times.
Data Privacy:
- For more privacy, choose Observed as it excludes Modeling, which relies on statistical estimations. Device-based is also an option, but it lacks cross-device insights and might miss some users.
- If you’re looking for more accurate insights, go with Blended as it prioritizes User ID, offering cross-device tracking. However, it also uses Modeling if other methods fail, which might raise privacy concerns for some.
Traffic Volume:
- For low-traffic websites, Device-based can be sufficient, as user identification might be limited anyway.
- For high traffic websites, Blended or Observed will likely provide more accurate user counts and insights due to a higher chance of capturing User ID or Google Signals.
Switching between reporting identities
You can easily switch between the Blended, Observed, and Device-based options at any time, without affecting your historical data.
Here's a quick breakdown of what you need to know.
When to switch:
- If you're still experimenting to find the best option, trying out different identities can help you understand how they impact your reports and user insights. See which one aligns best with your goals.
- If you start collecting User-IDs, you might want to switch from Device-based to Blended or Observed for more accurate insights.
- If data privacy is a top concern, you might switch to Device-based. Be aware, though, that it has several limitations that can affect how accurately it tracks user journeys across devices. These are covered in the Limitations section of this post.
How to switch:
1. Go to your GA4 property in the Admin section.
2. Under "Property settings" click on "Reporting identity."
3. Choose the desired option from the dropdown menu.
The little downward arrow on each of the three options provides additional details about the identity method.
4. Click "Save."
That's it! The change takes effect right away. Remember, switching only changes how your reports are calculated; it does not alter the data GA4 has already collected, so you can switch back at any time.
Here are some additional things to keep in mind:
- Impact on data thresholds: Switching to Blended or Observed might trigger data thresholds if you rely heavily on Google Signals. This could result in limited data availability in some reports.
- Communication with team members: If you share GA4 reports with others, inform them about any changes in reporting identity to avoid confusion.
Limitations of Reporting Identities
Data Threshold
One limitation of this feature is data thresholding: when a report hits a data threshold, GA4 withholds some rows, so the report no longer shows the complete data. For more information, check out this post: What is Data Thresholding in Google Analytics 4 Reports? by Ken Bandong.
Limitation of Device-Based GA4 Reporting Identity
If you're using Device-based mainly because of data privacy concerns, consider its trade-offs before settling on it. Its main limitation is that it tracks user journeys across devices inaccurately, and here are the reasons why.
- It is reliant on cookies. Cookies can be easily cleared or deleted, hindering accurate cross-device identification. Users may also use different browsers or devices without the same cookies, leading to fragmented tracking.
- It assigns a unique identifier to each device (a browser or app instance) rather than to a person. That identifier is not foolproof and can be reset or changed, leading to misattribution of activity to the wrong device or user.
- It reduces data volume because it only captures data from the specific device, meaning you miss out on insights from other devices used by the same user. This limits your understanding of their complete journey and preferences.
- It’s less accurate in modeling. Modeling techniques rely on additional data points like User IDs and Google Signals for improved accuracy. Without these, Device-based modeling has to rely solely on less reliable device identifiers, leading to potentially inaccurate estimations of user behavior.
Therefore, while Device-based Reporting Identity offers improved data privacy, it comes at the cost of potentially inaccurate and incomplete user data. Consider these limitations when deciding if it's the right choice for your needs.
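To make the fragmentation concrete, here is a toy comparison of user counts over a made-up event log. The log, the IDs, and both counting functions are invented for illustration, and the real stitching GA4 performs is more involved than this.

```python
# Made-up event log: one logged-in person on two devices, plus one anonymous visitor.
events = [
    {"device_id": "phone-A", "user_id": "u1"},
    {"device_id": "laptop-B", "user_id": "u1"},  # same person, second device
    {"device_id": "tablet-C", "user_id": None},  # anonymous visitor
    {"device_id": "laptop-B", "user_id": "u1"},
]


def count_users_device_based(log):
    """Device-based: every distinct device counts as a separate user."""
    return len({e["device_id"] for e in log})


def count_users_user_id_first(log):
    """Prefer User ID when present, otherwise fall back to Device ID."""
    identities = {
        ("user", e["user_id"]) if e["user_id"] else ("device", e["device_id"])
        for e in log
    }
    return len(identities)


device_based_total = count_users_device_based(events)
user_id_first_total = count_users_user_id_first(events)
print(device_based_total, user_id_first_total)
```

In this log, Device-based sees three users where there are really two people, because the phone and the laptop belong to the same logged-in person.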
Modeling
Modeling in GA4 doesn't provide precise measurements of user behavior. It uses statistical techniques to estimate user interactions and characteristics based on existing data patterns.
It’s also primarily employed within the Blended Identity option to fill gaps in data for users who cannot be identified through user IDs or device IDs.
In terms of reliability of modeled data, it depends significantly on two factors:
- Data Volume: Ample data provides a broader foundation for modeling algorithms to make accurate inferences. Limited data can lead to unreliable estimates.
- Data Quality: Well-structured and consistent data is crucial for effective modeling. Inaccurate or incomplete data can introduce biases and errors into modeled results.
Because of this, potential issues might arise, such as:
- Over-Attribution: Modeling might attribute too many events or actions to a single user, potentially inflating user engagement metrics.
- Under-Attribution: Conversely, it might underestimate user activity by failing to recognize cross-device behavior or attributing events to separate users.
- Inaccurate User Journeys: Modeled user journeys might not accurately reflect actual user behavior, leading to flawed interpretations of user paths and actions.
- Changing Behavior Patterns: Modeling techniques might not adapt quickly to sudden shifts in user behavior, resulting in inaccurate predictions or outdated insights.
Frequently Asked Questions about GA4 Reporting identity
Can I exclude data from specific identity spaces?
Yes, you can exclude Google Signals in the GA4 UI. Here are the steps:
- In your GA4 property, go to the "Admin" section.
- Under "Property," click on "Data Settings."
- Click on "Data Collection."
- Find the "Include Google Signals in Reporting Identity" option and toggle it to "Off."
- Click "Save" to apply the changes.
How does GA4 combine data from different identity spaces?
GA4 prioritizes User ID, then Google Signals, then Device ID, then Modeling. It uses probabilistic matching to link data points together when possible.
Can I manually add modeling in my GA4 Reporting Identity?
No, you cannot manually add modeling to your GA4 Reporting Identity. Modeling is an automated feature that Google applies to the Blended identity to fill in gaps when observed identifiers (User ID, Google Signals, or Device ID) are missing. It's not a setting you can directly control.
Final Word
In conclusion, we hope that our exploration of Google Analytics 4 has brought to light the often underestimated Reporting Identity feature. Despite its subtle presence, this tool plays a crucial role in understanding user behavior and helps you make informed decisions. Throughout this blog post, our goal has been to emphasize the importance of learning this feature for effective data analysis.
Now that you're armed with a deeper understanding, we encourage you to apply this knowledge in your own analysis. The Reporting Identity feature is often missed, but it is a powerful tool in analytics. Thank you for joining us on this exploration, and we trust that your future work in GA4 will be enhanced by a newfound awareness of its Reporting Identity feature.
Thank you for reading, and happy analyzing!