mnot / avoiding-internet-centralization Goto Github PK

View Code? Open in Web Editor NEW

42.0 42.0 8.0 794 KB

Internet-Draft about avoiding internet centralization

Home Page: https://mnot.github.io/avoiding-internet-centralization/

License: Other

Makefile 100.00%

avoiding-internet-centralization's People

Contributors

Stargazers

Watchers

Forkers

synctext dmfay gibson042 stefanosalsano tfpauly jfinkhaeuser rfc-editor

avoiding-internet-centralization's Issues

'out of control'

I'd like to work this quote in if possible (along with a reference to a very topical paper) if possible:

The Internet is "out of control" only in the much more limited sense that nature or the capitalist economy are out of control -- it is run on a distributed rather than a centralized basis

x-devonthink-item://96EBD419-DBD7-409B-A860-C016B549FE16?page=3

Evaluate New Decentralization Techniques

... is still a bit thin. Depending on how they're used, crypto techniques can promote as well as remove centralization risk.

More on trusted introduction: countersigning and resilience

Bringing two models together, centralization risks hit the PKI model pretty hard. We just saw an instance in which 2 million certificates were invalidated by LetsEncrypt because of a flaw in their verification code. But wouldn't it have been "nice" if people using LE certs could ALSO have had their CSRs signed by another authority Just In Case. But the standard doesn't really support that. THAT's something we could do something about in our standards, no?

And this brings me to a bigger point: I think you should highlight, perhaps with some fun references to papers of record or the like, the harm centralization has caused. What you are addressing isn't theoretical. This will link to my next issue as well.

Expand upon distributed consensus / blockchain

Federation

I think federation is an area worth a lot more exploration. We began discussing this a bit on list, and I really want to get into with you just a bit more.

When we look at the history of wifi, it is one of federation, and it is a HUGE success. Thanks to my dear friend and former colleague Klaas Wierenga, students, researchers, and professors can travel to just about any university and connect. It's lovely. But it's not the only federation. We can consider iPass a commerical federation, and now there is Open Roaming. We're about to see the same sorts of mechanisms spread to private 5g.

But federation, as you say, does have its limitations. In the case of public 5g and anything of that scale, capital costs are so prohibitive as to limit the number of entrants. Is this Internet centralization? Maybe not, but a key factor seems to me to be that universities and most enterprises aren't in the business of offering wifi, so they don't mind its use so long as its not abuse. More on that in another issue ;-)

You're being a tease in Section 6.3

The heading for this section is:

6.3. Build Well-Balanced Standards

But then you don't define what a Well-Balanced Standard is!!

tradeoffs

esp. vs. efficiency (latency, power and other costs).

More on DMA

You write in Section 6.2:

Keen readers will point out that social networking is effectively
centralized despite the existence of such standards (see, e.g.,
[W3C.CR-activitystreams-core-20161215]), because the IETF and W3C
create voluntary standards, not mandatory regulations.

Probably you do want to mention DMA here. It's not just keen readers anymore.

Sybil attack description does not follow classical definition

I am reading this draft and my general impression is "great work" and "this should be published and promoted." But then, I find this sentence in the section about the blockchain-based designs:

Sybil attacks (where enough participants coordinate their activity to affect the protocol's operation) are a major concern for these protocols.

Sybil attacks don't do that. Sybil attacks typically affect systems in which the system state reflects the consensus of a majority of the "participants". The wikipedia entry correctly states "A Sybil attack is a type of attack on a computer network service in which an attacker subverts the service's reputation system by creating a large number of pseudonymous identities and uses them to gain a disproportionately large influence."

The various kinds of "proofs" found in blockchain-based systems attempt to Sybil attack by only considering consensus among those pseudonymous entries that meet some condition. For example, systems based on "proof of work" require participants to consume a large quantity of energy before obtaining a "vote", and systems based on "proof of stake" only considers those participants that can "stake" a large quantity of some resource.

Timing and stakeholder involvement in the standards process, as well as use-it-or-lose-it

Is it worth asking the question about how to bring in a broader swathe of stakeholders such that the standards can be developed that allow for both flexibility and interoperability? For instance, how might we get a selection of new entrants and entrenched players to play together? My personal observations at M3AAWG lead me to believe that this may be a very hard problem, but the IETF seems to do its best work when you have a good mix of operators and developers (made easy when we're in the space of devops ;-)

Similarly, are there RFC 9170 issues hiding here that also need to be considered when it comes to centralization?

balancing against other goals

E.g., privacy sometimes comes up in tension (at least purportedly). However, privacy is much more effectively regulated by architecture than competition; play to our strengths.

Questions

It might be interesting to pose some questions that should be asked about proposals for centralized or decentralised technologies, in the spirit of 'What's the Problem Represented to be?'.

Some food for thought here.

Be careful about RFC 8890 and the like

You write:

Second, as the Internet's first duty is to the end user [RFC8890],
allowing such power to be concentrated into few hands is counter to
the IETF's mission of creating an Internet that "will help us to
build a better human society."

It's not that I disagree, but this may strike some as bloviation. You don't need it here. Regardless, You may wish to state that centralized functions have the potential to disrupt huge swathes of critical societal functions without notice (cf my earlier issue).

Talk more about data?

Data is often brought up as a compounding factor in competition discussions, and it's likely the same for centralization.

This might also be an opportunity to talk about:

data mininmisation
focus on formats rather than APIs where appropriate

Is your organization and of the types of centralization correct?

You discuss direct, indirect, inherited, and platform centralization. Is this model the appropriate delineation? Is there more direct language you can use to describe these forms?

access to data also creates barriers to entry

relate to observation

random reading

Centralisation and concentration

Distinguish them. E.g., many people use gmail, but aren't required to by the architecture or network effects, etc. It's concentrated, but not centralized.

Moar Habermas

Habermas's own discussion of a concrete political agenda includes recommendations for increased decentralization in order to allow pluralistic decisionmaking. Decentralization also serves to counteract the "generation of mass loyalty" sought (and increasingly, he believes, achieved) by mass institutions such as political parties and states.

Ref to KENNETH BAYNES, THE NORMATIVE GROUNDS OF SOCIAL CRITICISM 109 (1992) 179-80.

x-devonthink-item://4EFBE3F1-1988-414A-870E-44F21A17B8EA?page=17

Structural Separation

A further regulatory problem with natural monopolies is to identify their boundaries. One of the key arguments in favour of the vertical separation of infrastructure industries, for example, has been that this allows the introduction of competition in potentially competitive areas of the industry. However, often the ability to ‘separate out’ particular parts of an industry is dependent on the availability of technological devices.

-- Martin Lodge and Kai Wegrich, "Managing Regulation" (Macmillan, 2008), 20-1.

This ties nicely to SDO's ability to identify and create those 'pivot points' for separating out functions.

Privacy

Jari says:

Section 6.1 talks about relationship between centralization and privacy, and points out that there's sometimes a tradeoff. I would argue that this is heavily dependent on what we're talking about. We're far along on the mission to encrypt all communications. Many of the remaining privacy issues are made worse, not better, by centralization. E.g., one entity has access to all your mail or browser history or other things. Perhaps the 6.1 language could be improved, as it currently reads a bit like focusing on privacy in standardisation and leaving centralisation to other entities. I’m not sure that’s always desirable, particularly if we want to address privacy.

Intermediaries or agents?

As discussed elsewhere, I wonder if whether the discussion should be about agents rather than intermediaries. Sometimes these agents are intermediaries, but sometimes they are authentication servers or perhaps other things.

Define intermediation

Watson said:

I think the comments on intermediaries in
draft-nottingham-avoiding-internet-centralization need some more work.
It's not clear to me why intermediation should be considered to lead
to centralization. Often in examples I've seen the intermediation is
not a network but a mediator between the two ends. This can be
centralizing due to two sided market dynamics or decentralizing due to
ready access to many intermediates and a reduction in trust invested
in an endpoint.

From a network perspective all devices transporting packets are
intermediaries between the hosts that ultimately generate them. While
encryption does reduce the possible intervention of these boxes,
there's quite a few cases where they do do things, and even without
centralization manage to undercut permissionless innovation.

... and I replied:

I may have been infected by a legal/policy perspective that assumes a much broader definition of 'intermediation' -- will try to clarify to get everyone on the same page.

Benefits of Centralization

In Sections 2 and 6.4 you do discuss "necessary centralization", but that discussion is couched almost as an aside, and it is not clear to me that "necessary" is the word you should use. I might call out "benefits" more directly, because it will help bring into focus the difficulty of the challenge. Using an example here might help. Here are two such:

The most significant environmental impact of computing is always at the edge. Each watt expended by consumer gear such as laptops, smart phones, tablets, and routers is multiplied by millions if not billions. The cost of environmental cost of production and disposal of each consumer device is similarly multiplied. Each watt saved by use of some amount of centralization is conversely of huge societal benefit. Everyone running their own SMTP local server would have a massive cost.

Similarly, the more distributed control, particularly to the users, requires more work by those same consumers to maintain and protect their components from vulnerabilities and exploits. Some centralization absolves end users of those responsibilities.

The question then turns to how to balance this benefit against the risks.

And this brings into question whether "intermediaries" in Section 6.4 is really the right term to use. I don't have a better one offhand, tho. Something on which to noodle?

By taking this approach, you express to the readers the seriousness and depth and respect that you treat the topic, and that it should demand of them.

Intermediary power differentials

E.g., forward proxy vs. reverse proxy

depends on point of interposition, agency

highlight differences in regulation mechanisms

e.g., architectural tends to be 'hard' -- not possible to overcome by someone changing their mind or being coerced, whereas legal/normative are more 'soft' -- e.g., icann et at deciding to drop .ru (though they didn't).

Multi-Stakeholder Administration is Hard

Further to Issue #33, I would broaden out this section. In our community we tend to think of "multistakeholderism" as the RIRs and ICANN. But you mean something quite a bit more broad. How would FBs, Twitters, "Truth", and the like establish governance models to avoid abuse? You do mention the CAB Forum, which is excellent. But you bury that example. I think I would recommend either defining the term broadly and sticking CAB up front, followed by perhaps my (as of now) fictitious example, and then followed by more classic stuff like the DNS.

Inappropriate decentralisation?

Habermas can be read as arguing against over-decentralization; see comment.

This is briefly touched upon in 'avoid over-extensibility'. While extensibility allows distributed evolution, that can surface other power relationships -- e.g., a larger vendor is able to abuse their power.

Privacy

Some additional context for privacy considerations around distributed consensus here:
https://coinsights.substack.com/p/the-duality-of-web3

Capture Resistance

@darobin suggests this as a concept - might be more crisp than 'centralization risk'

Editorial nit: two parties that are not in direct contact

Should that be "two parties who are not in direct contact"?

What would you like to see by way of future work?

I think this document screams for a call-to-action and a summary. As you put such a thing together, it will help to observe an adage one of my friends lives by:

Say what you're going to say (abstract)
Say it (most of the body)
Say what you said (summary) and what you want to have happen next (call to action)

Reading list

Papers / sources to check:

Certificate authorities

Watson points out:

"Similarly, the need for coordination in the Web's trust model brings
centralization risk, because a Certificate Authority (CA) can control
communication between the Web sites that they sign certificates for
and users whose browsers trust the CA's root certificates."

This isn't quite the case; a CA can sign for any website (ignoring
CAA, etc.) and then intercept. However techniques like CT make it much
more visible.

Add examples

Throughout, but especially in the recommendations.

Counterbalances

There are a number of factors that argue against complete decentralisation -- or at least make it less attractive / more difficult. Discuss:¹

efficiency (many functions are inherently more efficient at higher scale. This may imply downward pressure on the number of providers of a given function)
complexity (disintermediated communication exposes users to more complexity - in particular, the choices they're required to make. This has a cognitive load.)
specialisation (having a function concentrated into relatively few hands can result in them having greater proficiency in providing it, whereas fully distributed it would be performed less well -- especially relevant to security and reliability)
privacy / hiding in a crowd (concentrating a function can protect against certain kinds of attacks / surveillance by removing the ability to distinguish the target)

Note that none are reasons to allow centralisation. May be good to expand upon the idea that centralization is a spectrum with various tensions pulling in each direction.

also in notes was 'collective action' -- worth pursuing? ↩

Extensibility and maintenance

The "Over-Extensibility" section points out the risks of designs that only specify a narrow core of functions and rely on extensibility to meet actual demand. That's a great point, but the text only addresses the initial design of the protocol. I wonder whether the text should be extended to something like "protocol gardening". Gardeners will shape trees as they grow by pruning them. Should standard bodies similarly bring extensions inside the core of a new revision, or prune (and prohibit) extensions that demonstrate too much centralization risks?

A whimsical point: GNS

4.2 necessary centralization and DNS cover good ground in your draft. Right now, draft-schanzen-gns exists. Here we have the PGP web of trust of naming. The value of this draft is that it attempts to provide a way around centralized authorities and is very relevant in a world where belligerent governments try to impose controls on DNS. They turn DNS on its head and make every resolver the root.

You are already hitting at one of the two weakpoints that draft has: how do you do that first rendezvous? The other weakpoint may also be worth mentioning here: without some sort of trusted introducer model, if keys are lost, that really is the ballgame. And those keys have to be properly managed. I asked the authors to undertake a thought experiment: what would it be like for FB to run GNS? You may want to touch on some of this as a complication.

Centralization Scales Governance

As you may have noticed, I've been going at it with EKR just a bit on interoperability versus interconnectivity over his discussion about Europe's Digital Markets Act. First, I highly recommend that you check out his and Steve Bellovin's views, as you may find them enlightening. This goes to my earlier issues in several ways:

Is the organizational model the one you want or might you take advantage of the timeliness of the DMA debate to think in different terms?
Your point about reputation and email is very good and important; and goes to this interconnectivity versus interoperability aspect. What is perhaps important is understanding the relationship between the protocol design and the governance model.

Let's take WhatsApp for example (this is what I mentioned to Eric). They have a way cool approach to abuse reporting: and endpoint sends a message in the clear to a central authority who evaluates it and puts people on the naughty step if it is warranted. It might be possible for WA to share those reports with other services. That happens with email with DMARC and ARF, fur instance. But the governance model is important as well. Would you automatically send an abuse report to Eliot's Bar and Grill IM service without first reviewing it? Would you accept an abuse report and take action from Eliot's Bar and Grill IM service without first reviewing it?

Obviously there are non-protocol aspects here, but the protocol developers need to recognize those as they are building out capabilities.

Underpinning for decentralization

More directly relate it to polycentricity, development, and governance overall.

compare with https://www.worldbank.org/en/topic/communitydrivendevelopment/brief/Decentralization

Discovery / defaults / etc.

Jari says:

[O]ne thing that is not discussed but perhaps should be is the role of discovery. We seem to have an increasing number of solutions that are built for relatively fixed linkages between OS - device - browser - application - network services. For instance, a default service lead to a situation where a vast majority of users will use the default service. From a standards perspective it would be better to build on discovery such that for instance local services can be used. This is of course not always easy, but I think it needs to be looked at, at least on a per-case basis.

This is most closely related to the 'Decentralise Proprietary Functions' section, but it's not obvious / related enough; more examples and perhaps reframing might help.

build affordances for regulation

should be a primary goal for standards efforts.

Spend a little time more time up front on centralization risk

Given that you use the term "centralization risk" throughout your draft, I think you should spend a bit more time defining it on first use. E.g., something like, the likelihood of most people using one or a small number of providers for the same service (or some such).

Externally-induced centralisation

Jari points out:

Section 4.4 inherited centralization — I’m not sure how much network-imposed limitations we have today, or in few years, given that in the western world at least much of the traffic is today encrypted. However, there are situations where government mandates or government - large internet content provider collaboration can create a setup where the user's choices for communication are severely restricted, and consequently, often centralized to few remaining ones. Perhaps you could talk a bit about this.

Linkage between the earlier sections and Section 6 is not explicit and a bit tenuous

Perhaps my biggest concern with the draft is this:

What can be remedied and what can't be remedied is somewhat lost because of a lack of clear causal linkage, as articulated between the sections. This is perhaps where I would spend the most time on the draft.

Try this: make a copy of the draft; remove Section 6, and then map examples of centralization you use in earlier sections to remediation approaches you think would help. Then take that text and see if it fits into the earlier sections. If you do this, you'll have solved the problem entirely.

Mention browsers?

I think they are worth highlighting. I think they illustrate rather well that increased complexity in a particular function might make it more likely for it to become centralized.

Redundancy / Distributed as a function of resistance against centralization

Not sure if it fits the scope of the document that I just read now. And probably not completely matured in my head too (rough thoughts)

One feature to reduce centralization is the redundancy and/or duplication of a feature. Geo-Distributed Multiple servers make it easier to avoid the control by another entity at a different or in the same layer to control the system. If server A is down (for any reasons: technical, political, economic, etc.), re-routing to server B becomes a function of resilience, specifically if server A and server B are not controlled by the same political or economic entity.

You are mentioning this a bit in the notion of federated protocol and also the issue #11 has an impact on this.
Aka open standard format of data frees the data of being captured by a single entity that making them processable in a different context, by multiple clients, on multiple networks. It also makes the "data redundant" for the lack of a better word, aka having multiple copies of them in different locations and contexts (as opposed to a single point of failure such as a social network message with a unique URI for accessing it).

As you said all of that doesn't prevent centralization, but it encourages the possibility of decentralization.

Discuss tradeoff of decentralization

e.g., large functions tend to be administered / operated more professionally, thus more available/performant/resistant to attack/have appropriate legal resource/etc.

However, if they are too large, they can take out a significant portion of the internet.

Depending on how difficult it is to switch providers / use multiple ones, different balance points can be found.