Hi! I tried visiting the wiki here on github, but I can't find what


What is considered logging? (in reference to Cloudflare) about dnscrypt-resolvers HOT 11 CLOSED

dnscrypt commented on May 28, 2024

What is considered logging? (in reference to Cloudflare)

from dnscrypt-resolvers.

Comments (11)

jedisct1 commented on May 28, 2024 4

There has never been a formal definition of a non-logging resolver, but this is a very important topic, and something that we should define all together.

Logging the client IP address, even temporarily, should probably clear the 'non-logging' bit immediately.

Now, what about logging queries and responses?

Even without client IP addresses, this can leak sensitive information.

While a unique sequence of queries does not reveal the client IP, it reveals when that device is online.

More importantly, DNS queries, even to nonexistent names, reveal information about the network, what software is being used and more.

For example, queries for testing-secret-internal-project.bankofamerica.com could reveal the address of something that was originally not supposed to be public.

Another issue is that when a query for a nonexistent name is made, operating systems can be configured to retry using the "default" domain (or even a set of domains, e.g. with the search property in resolv.conf). So, a Bank of America employee trying to access hardcorefishrubbingfetish.com would send a query for that name first, and fall back to a second query for hardcorefishrubbingfetish.com.bankofamerica.com.

While the first query doesn't reveal much information about the identity of the client, the second does.
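To make the fallback concrete, here is a minimal sketch of how a stub resolver might build its list of candidate names from a search list. It is a simplified model of the `search` and `ndots` options in resolv.conf, not any particular resolver's code; the function name `candidate_names` is made up for illustration. Each later candidate is only tried if the earlier ones fail.

```python
def candidate_names(name, search_domains, ndots=1):
    """Return the fully qualified names a stub resolver might try, in order.

    Simplified model (assumption): a name ending in a dot is absolute and is
    never expanded; a name with at least `ndots` dots is tried as-is first,
    then with each search domain appended; otherwise the search domains are
    tried first.
    """
    if name.endswith("."):
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# The example from the comment above: the second candidate carries the
# employer's domain, so it says something about who is asking.
for fqdn in candidate_names("hardcorefishrubbingfetish.com", ["bankofamerica.com"]):
    print(fqdn)
```

The second name printed is the one that ends up at the resolver (and possibly at the authoritative servers of the search domain) once the first lookup fails.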

A third issue, similar to the previous one, is browser autocompletion, which can also trigger the default suffix. As a result, search queries can end up as queries for <search query>.bankofamerica.com.

Unfortunately, this information is already public. Sensors recording queries and responses sent to authoritative servers are everywhere. Companies such as Cisco and Farsight log everything they see and sell access to their databases. This data is stored forever. There are also many free services doing the same. This is very useful for security and marketing purposes.

Even data sent to a resolver that doesn't log may end up in these databases, because the sensors are placed between the resolvers and the authoritative servers, not between the client and the authoritative servers.

So, the consensus in the DNS community, maybe as a way to downplay the fact that DNSSEC doesn't provide any confidentiality, or that names can be brute-forced, has always been that "DNS data should be considered public".

If we agree with that, maybe the definition of "doesn't log" can just be "doesn't log the client IP, even temporarily".


irtefa commented on May 28, 2024 2

Hi,

I am the product manager for the 1.1.1.1 team. I can see why this can be confusing. We don't store anything that can actually tell us how many unique users we have for the public DNS resolver. We do internally sometimes make rough estimates based on the number of queries.

Here's what we actually log:

Timestamp
IP Version (IPv4 vs IPv6)
Cloudflare Resolver IP address + Destination Port
Protocol (TCP, UDP, TLS or HTTPS)
Query Name
Query Type
Query Class
Query Rd bit set
Query Do bit set
Query Size
Query EDNS enabled
EDNS Version
EDNS Requested Max Buffer Size
EDNS Nsid
Response Type (normal, timeout, blocked)
Response Code
Response Size
Records in Response
Response Time in Milliseconds
Response served from Cache
DNSSEC Validation State (secure, insecure, bogus, indeterminate)
PoP ID
Server ID
Autonomous System Number
(source: https://developers.cloudflare.com/1.1.1.1/commitment-to-privacy/privacy-policy/privacy-policy/)

We will work on making this clearer in our privacy policy.


publicarray commented on May 28, 2024 1

Good question!
I actually never read their policy, but from this I'm not sure. I suppose it depends on an individual's threat model. To play it safe, I agree we could remove the non-logging label.

Just as a reference here is what I'm doing on my server: https://dns.seby.io/stats.html All this really shows is how the server and the clients are behaving. I'm pretty sure that it's impossible to identify someone from these graphs. This is the only data I have. I use it to see how popular the service is and if I need to take manual action (e.g. when the graphs go down and stay at 0 or sky-rocket and someone is abusing the service)

From my graphs I could get aggregate data on the following :

Timestamp (in a few minute increments)
Query Type
Query Class
Query Rd bit set
Query Do bit set
Query EDNS enabled
Response Code
Response Time in Milliseconds
Response served from Cache
DNSSEC Validation State (secure, bogus)
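As an illustration of what "aggregate data" of this kind can look like, here is a small sketch of counters kept per time bucket. It is not the code behind the graphs linked above; the class name, the field names and the 5-minute bucket size are all assumptions. The point is that the recording function never receives a client IP or a query name, so neither can end up in the stored data.

```python
from collections import Counter

BUCKET_SECONDS = 300  # assumed granularity ("a few minute increments")


def bucket_start(timestamp, bucket_seconds=BUCKET_SECONDS):
    """Round a Unix timestamp down to the start of its bucket."""
    ts = int(timestamp)
    return ts - ts % bucket_seconds


class AggregateStats:
    """Per-bucket counters only; individual queries are not kept."""

    def __init__(self):
        self.counts = Counter()

    def record(self, timestamp, qtype, rcode, cached):
        # Only coarse metadata is accepted: no client IP, no query name.
        bucket = bucket_start(timestamp)
        self.counts[(bucket, "qtype", qtype)] += 1
        self.counts[(bucket, "rcode", rcode)] += 1
        self.counts[(bucket, "cached", cached)] += 1


stats = AggregateStats()
stats.record(1000, "A", "NOERROR", True)
stats.record(1100, "A", "NXDOMAIN", False)
print(stats.counts[(900, "qtype", "A")])
```

Both sample queries fall into the same bucket, so after recording them only the counters remain.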

I don't consider this as logging, but I'm technically logging some information, so maybe I should remove the no-logging label too? I don't know. It depends on an individual's threat model.

Maybe we should define logging such that if it's possible to identify a unique user or query from the logs, it's logging; otherwise it's non-logging? That definition still doesn't help much though.

For Cloudflare I think they may use unique identifiers to determine unique users in the 24 hour period. Then after 24h they just increment the "Number of unique users" counter. I don't know, I'm speculating. I do think they are pushing the no-logging envelope a bit though.

@jedisct1 What do you think


jedisct1 commented on May 28, 2024 1

The information Cloudflare logs doesn't seem to be enough to passively link queries to users, so the Number of unique users mention in their privacy policy is a bit concerning.

Maybe they make a rough estimate based on the number of queries, and the fact that on average, a user makes x queries per day.

Or maybe they temporarily use client IP addresses, independently from the payloads they send and receive, for throttling and DoS mitigation. That can be implemented at any layer, but a firewall rule that prevents a single client IP from sending tons of queries in a short time fits in this category. Using client IP addresses that way is probably fine and should not void the "non-logging" flag.
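For illustration, this kind of throttling could be sketched as an in-memory token bucket keyed by client IP. This is a hypothetical example, not a description of what Cloudflare or any resolver actually runs; the class name `RateLimiter` and its parameters are made up. The limiter never sees the DNS payload, writes nothing to disk, and forgets idle clients.

```python
import time


class RateLimiter:
    """In-memory token bucket per client IP (illustrative sketch)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate        # tokens refilled per second
        self.burst = burst      # maximum bucket size
        self.clock = clock      # injectable clock, handy for testing
        self._buckets = {}      # client_ip -> (tokens, last_seen)

    def allow(self, client_ip):
        """Return True if this client may send one more query right now."""
        now = self.clock()
        tokens, last = self._buckets.get(client_ip, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[client_ip] = (tokens, now)
            return False
        self._buckets[client_ip] = (tokens - 1, now)
        return True

    def expire(self, max_idle):
        """Drop state for clients that have been idle longer than max_idle seconds."""
        now = self.clock()
        self._buckets = {
            ip: (tokens, last)
            for ip, (tokens, last) in self._buckets.items()
            if now - last <= max_idle
        }
```

Because `allow()` takes only the address, there is nothing in this component that could correlate an IP with a query name; whether a real deployment keeps the two that separate is of course something only the operator can state.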

Number of unique users in their policy may refer to this.

Rather than speculating, maybe @vavrusa can clarify what exactly gets logged and what Number of unique users refers to?


jedisct1 commented on May 28, 2024 1

Thanks a lot for chiming in and for the clarification, Mohd!

So, shall we define "non-logging" as "doesn't log or use the client IP address, except for rate limiting, and without correlation with DNS queries"?

What do you think?

The "non-logging" bit is important, if only because by default, dnscrypt-proxy ignores resolvers that don't have that bit set (and we probably shouldn't change that).
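As a rough sketch of what filtering a resolver list on that bit could look like, the code below reads the properties field from a DNS stamp. It assumes the stamp layout as I understand it: `sdns://` followed by unpadded URL-safe base64, a protocol byte, then a little-endian 64-bit properties value in which bit 1 (value 2) means "no logs". Relay stamps do not carry properties and are not handled. The helper names are made up, and `require_nolog` is only modeled loosely on the dnscrypt-proxy option of the same name.

```python
import base64

# Property bits (assumed from the DNS stamps format)
PROP_DNSSEC = 1 << 0
PROP_NO_LOGS = 1 << 1
PROP_NO_FILTER = 1 << 2


def stamp_props(stamp):
    """Return the properties bitfield of a server stamp."""
    prefix = "sdns://"
    if not stamp.startswith(prefix):
        raise ValueError("not a DNS stamp")
    payload = stamp[len(prefix):]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    raw = base64.urlsafe_b64decode(payload)
    if len(raw) < 9:
        raise ValueError("stamp too short to carry properties")
    # raw[0] is the protocol byte; the next 8 bytes are the properties.
    return int.from_bytes(raw[1:9], "little")


def usable(stamp, require_nolog=True):
    """Keep a resolver only if it declares the no-logs property (when required)."""
    return not require_nolog or bool(stamp_props(stamp) & PROP_NO_LOGS)
```

The bit is purely declarative: the code can check that a resolver claims not to log, not that the claim is true, which is why the definition being discussed here matters.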


publicarray commented on May 28, 2024 1

Yes I’m happy with that 👍


irtefa commented on May 28, 2024 1

That's correct. We may use the IP address for rate limiting but we don't log them. Furthermore, they are not associated with DNS queries.


brainscar commented on May 28, 2024

Thank you both so much for your responses, I really appreciate the open discussion we're having.

I think this topic goes beyond just Cloudflare, and it was not my intention to single them out.

In terms of what is considered logging I think there are at least 3 instances that we're dealing with:

no logging.
no logging, except for some un-identifying information.
not logging the IP, but taking the rest to an extreme (in my opinion, Cloudflare does this).

Which raises the question: at what point does it become too much?

I agree with @jedisct1 about this:

Unfortunately, this information is already public.

For example, testing-secret-internal-project.bankofamerica.com could also be found by things like:

bruteforcing subdomains
checking for issued SSL certs. For example, GitHub has SSL certs for these subdomains: https://crt.sh/?q=%25.github.com. As you can see, some of them are not for end users.

(sorry @jedisct1 no fish rubbing at github yet.)

So in that sense I would agree with "ip logging is considered logging".

However, when we look at the list of what Cloudflare logs, I do believe there is more to worry about than just queries and responses.

And that's where I would love to get your input, @jedisct1 and @publicarray.

You see if the query is public data but the ip address isn't, one could argue:

anyone could have made that request.

However if we look at that list, I don't think that statement applies anymore.
After all, if you narrow it down, that list is essentially a unique fingerprint, which then becomes attached to the query. And that's my concern: being able to put query and person together.
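One way to make this concern measurable is to count how many log records share each combination of metadata fields: a combination that occurs only once singles out one record. The sketch below does that on made-up records; the field names (`asn`, `pop`, `edns_bufsize`, `protocol`) are loosely modeled on the list above and the values are synthetic, so it says nothing about how unique real traffic actually is.

```python
from collections import Counter


def anonymity_sets(records, fields):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(record[f] for f in fields) for record in records)


def unique_fraction(records, fields):
    """Fraction of records whose field combination occurs exactly once."""
    if not records:
        return 0.0
    sets = anonymity_sets(records, fields)
    singletons = sum(1 for count in sets.values() if count == 1)
    return singletons / len(records)


# Synthetic records, for illustration only.
records = [
    {"asn": 64500, "pop": "A", "edns_bufsize": 1232, "protocol": "UDP"},
    {"asn": 64500, "pop": "A", "edns_bufsize": 4096, "protocol": "UDP"},
    {"asn": 64500, "pop": "A", "edns_bufsize": 1232, "protocol": "TLS"},
    {"asn": 64501, "pop": "B", "edns_bufsize": 1232, "protocol": "UDP"},
]

print(unique_fraction(records, ["asn"]))
print(unique_fraction(records, ["asn", "edns_bufsize", "protocol"]))
```

On this toy data, adding fields shrinks every group down to a single record, which is the "narrowing down" effect described above. Whether the same happens with real logs depends on how many clients share each combination.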

Thank you guys again, I hope we can continue this conversation.


irtefa commented on May 28, 2024

"non-logging" as "doesn't log or use the client IP address, except for rate limiting

Yes. IMO, that's fair.


brainscar commented on May 28, 2024

@irtefa could you please confirm that the end of the sentence applies to Cloudflare too?

doesn't log or use the client IP address, except for rate limiting, and without correlation with DNS queries.

Then as far as my opinion goes, I'm good with it too, as my only concern left was the one @jedisct1 mentioned here: #128 (comment)


captn3m0 commented on May 28, 2024

How about changing "log" to retain?

"doesn't retain the client IP address, except for rate limiting, and without correlation with DNS queries"?

For DoH resolvers, even things like User-Agent + ASN might be enough to identify users, so changing "client IP address" to "user identifiable information" might be better.

The Mozilla DoH resolver policy takes it up nicely: https://wiki.mozilla.org/Security/DOH-resolver-policy

