WebXray domain ownership list
Revision as of 14:17, 1 August 2019
Drawn from the webXray project, this list provides a hierarchical accounting of what entities own commonly found third-party domains on the web.
The database is hosted on GitHub.
What the database contains
The list is a JSON file. Each entry in the list has the following fields:
id: a numeric identifier (integer) for the entry; it changes whenever the list is expanded and reindexed, so do not count on it remaining stable
parent_id: if the entity has a parent owner, the id of the parent
owner_name: a string which is the name of the service (e.g. 'Google Analytics') or the company (e.g. 'Google') which owns the domain
aliases: an array of strings representing possible alternate spellings of the owner_name (e.g. 'YouTube' and 'You Tube')
homepage_url: a string which is the URL of the homepage of the service or company
privacy_policy_url: a string which is the URL of the privacy policy of the service or company
notes: a string which has pertinent information as to why a domain was assigned to a given owner
country: the ccTLD for the country in which the service or company is based
uses: what a first party uses the service for. Note that the first-party use may differ from the ultimate third-party use: for example, a site may use audience measurement tools from a third party to gain insights into its traffic, while the third party uses the same data for marketing.
platforms: where the domain has been observed, so far 'web', 'mobile', and 'email'
domains: an array of domain names (strings) which are owned by the given service or company
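As an illustration of how these fields fit together, the following is a minimal sketch that indexes entries by id and by domain, then follows parent_id upwards to list who ultimately owns a domain. The sample entries, owner names, and the helper names `build_indexes` and `ownership_chain` are hypothetical and are not taken from the real file; the sketch also assumes that `uses` is an array of strings and that a missing parent is stored as null.

```python
import json

# Hypothetical sample data: the field names follow the list above, but the
# ids, owners, and domains are illustrative and not copied from the real file.
SAMPLE_JSON = """
[
  {"id": 1, "parent_id": null, "owner_name": "Example Holdings",
   "aliases": [], "homepage_url": "https://holdings.example",
   "privacy_policy_url": "https://holdings.example/privacy",
   "notes": "", "country": "us", "uses": [], "platforms": ["web"],
   "domains": ["holdings.example"]},
  {"id": 2, "parent_id": 1, "owner_name": "Example Analytics",
   "aliases": ["ExampleAnalytics"], "homepage_url": "https://analytics.example",
   "privacy_policy_url": "https://analytics.example/privacy",
   "notes": "", "country": "us", "uses": ["analytics"], "platforms": ["web"],
   "domains": ["analytics.example", "cdn-analytics.example"]}
]
"""


def build_indexes(entries):
    """Return (by_id, by_domain) lookup tables for a list of entries."""
    by_id = {entry["id"]: entry for entry in entries}
    by_domain = {
        domain: entry
        for entry in entries
        for domain in entry.get("domains", [])
    }
    return by_id, by_domain


def ownership_chain(domain, by_id, by_domain):
    """List owner names from the direct owner of a domain up to the top parent."""
    chain = []
    seen = set()  # guards against accidental parent_id cycles
    entry = by_domain.get(domain)
    while entry is not None and entry["id"] not in seen:
        seen.add(entry["id"])
        chain.append(entry["owner_name"])
        parent_id = entry.get("parent_id")
        entry = by_id.get(parent_id) if parent_id is not None else None
    return chain


entries = json.loads(SAMPLE_JSON)
by_id, by_domain = build_indexes(entries)
print(ownership_chain("cdn-analytics.example", by_id, by_domain))
# prints ['Example Analytics', 'Example Holdings']
```

Because id values change when the list is reindexed, the lookup tables are best rebuilt each time the file is loaded rather than stored alongside other data.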
Why it is important
- The domain ownership list needs to be populated to offer a simple index of where personal data is being sent and where the legal ownership of the domains rests.
- Without these fields, filing subject access requests (SARs) is far more time-consuming, and mass inspections of websites are barely intelligible.
The third-party tracking ecosystem is currently free to operate and monetise individuals' personal data without transparency. The domain ownership list is a step towards addressing this.
Why it is important to build it collaboratively
- we can do SARs and build the whole pipeline of support around them
- we can do visualisations, maybe not for the general public but at least for super users of PDIO, to better understand progress
What needs to be done for PDIO
- define format for the adtech entries
- extend the SAR tool so that it can handle third-party situations
Wishlist on the webXray side
- a framework for contributions and an easy format for submissions
- contributions!