Towards publicly-hosted applications


It is often said that blockchain is a solution in search of a problem. Chromia is a response to problems that we have observed in the real world. This article intends to describe those problems and how we think it is possible to solve them. It is meant for a wide audience -- blockchain skeptics and enthusiasts, programmers and end-users alike.

Problems with centralized applications

Aside from a few peer-to-peer applications, all online services that you interact with today are "centralized". That is, they are hosted, run, and maintained by a single entity. In the most typical scenario, this entity is a for-profit company which desires to maximize its profits. This can be at odds with the desires and needs of its users and can be compared to the principal–agent problem: a service provider serves its users, but prioritizes its own interests even if they are in conflict with the interests of those users.

A for-profit company needs to address the needs of users -- without users, there are no profits. But such a company is motivated to provide as little as is necessary to efficiently monetize its user base. Users likely don't get the best service possible. The problems with this can be summarized as follows.

Lack of choice and features

A service owner does not need to cater to all of its users. In the case of services which have network effects (e.g. social media sites), users are often forced to accept things they find undesirable just to be able to access the network. For example, many people would rather pay for the service Facebook provides to avoid ads, but Facebook targets the majority of users who appreciate or tolerate the ad-supported service. Those who would prefer to pay are forced to accept this model or be excluded from the network. Facebook’s dominant position means that it is not motivated to provide a paid alternative to a minority of its users.

Information asymmetry


Many services rely on the fact that users do not fully understand how the business operates. Services tend to provide as little information as required by law, and on top of that, users might not have time to read terms in full and analyze all possible consequences. For instance, most online services have suffered from a data breach at some point (Sony, Google, Facebook, Equifax, etc.), and the computer security experts they employ were likely aware that this was a risk if not an inevitability. The average user has an expectation that private data sent to a service will remain private, and it is in the service provider’s interest to keep users in the dark about the risk of data breaches.

Privacy violations

This is largely connected to information asymmetry. It is very common for online services to violate user privacy in a way which is detrimental to users. Users are generally not aware of how their data is used, and they have very little recourse if they are unhappy about the way in which their data is used, short of ceasing to use the service.

Censorship

Service providers are motivated to cater to the majority of their users. This means that when a user is suspected of some kind of abuse, there is an incentive to err on the side of caution and ban users merely on the suspicion of abuse. As long as the majority does not object, it is acceptable to unjustly exclude a minority of users. Services like YouTube and Facebook use bots to determine if certain content violates the terms of service, which produces many false positives. While the use of a simple algorithm might be justified for initial screening, these services are sometimes overzealous and users may not have the option to dispute the decision. Human moderation is costly and adds little to revenues.

Reduced availability

Modern services have achieved very high levels of reliability but users still don't get the best deal possible. Service providers are motivated to spend as little as they can. In practice this means that the maximum tolerance for outages among users can effectively become the target service level. Further, data processing is often highly centralized as a way of guarding proprietary data and algorithms. As a result, service availability may be inferior to a more distributed architecture with a higher level of replication.

Monopolies and VC money

For-profit companies have long sought to build monopolies in order to maximise profit. Most of the great monopolies that have arisen throughout history have been brought to heel by regulation and antitrust legislation. However, this regulation did not anticipate the internet and its ability to transcend national borders (and national legislation), nor its powerful network effects and breakneck pace of development. These vast virgin markets are perfectly suited for monopolistic enterprises. This has not gone unnoticed by venture capitalists like Peter Thiel who has loudly and repeatedly claimed that “competition is for losers”.

While the monopolies of the 19th century drove prices up and wages down, the modern equivalents do exactly the opposite: lavish salaries, services that are free to the end user, AND huge profits. Isn’t that a win-win? In some cases maybe it is. The problems enumerated in this article remain, but many users are completely happy with the arrangement. Of course there are a host of potential pitfalls, like corporate surveillance intruding ever further into our lives, unforeseen effects on democracy and civic discourse, and bizarre security risks, which may well lead to some buyer’s remorse in the future.

Aside from all that, we believe in blockchain as a positive and equalising force in the world. It has been argued (by me) that the VC funding model places blockchain companies at a particular disadvantage because they are not well positioned to form monopolies, and cannot therefore offer the kind of massive medium-term returns that VCs gamble on. This leads to blockchain companies being grown, gutted, and sold rather than receiving farsighted and sustained investment. If blockchain is good for the consumer then, in this case, VCs are not.

Lack of interoperability

For-profit companies might seek to build monopolies as they are more profitable. Interoperability with rival systems typically increases the ability of users to move between services, destabilizing monopoly positions. For this reason, dominant companies deliberately avoid broad compatibility. The best example is probably the evolution of messaging software. Early protocols such as ICQ and MSN deliberately introduced breaking changes to disable alternative clients which offered users the convenience of connecting to multiple networks within one user interface. The XMPP protocol provided a standard for interoperability between compliant providers, including the now defunct Google Talk. After companies realized that their user bases were their most valuable resource, open standards were largely abandoned as they made it easier for users to migrate between platforms. It's hard to find a big-brand messenger which still supports XMPP, even though this protocol is adequate for all textual messaging needs. It's worth noting that even when Google Talk was still supported, Google made it hard to connect to normal non-branded XMPP servers.

Service discontinuation


It is very common that a service which can potentially be sustainable is shut down because it doesn't generate enough profit. This might seem paradoxical, but VC-backed companies are typically expected to generate high returns for investors. If a service is sustainable but does not generate significant profits, VCs will push it to acquisition, after which the service is typically shut down. This also often happens when a large, mature company sees that a service it provides is not 'big enough' to be interesting. One of the prominent examples is Google Reader, which was a web feed aggregator provided by Google. It had a lot of users who found it very convenient, but was shut down when Google decided to focus on its core businesses. When a service is discontinued, users have to go through the hassle of finding an alternative (if it exists), moving their data, and becoming familiar with the new service.

These problems make it interesting to explore alternatives to centralized online services. We should not expect that some clever hack will make Google and Facebook irrelevant, but perhaps we can create an alternative which can compete with centralized services at least in some niches.

Existing alternatives

Services hosted by non-profits and governments

Non-profits can be more aligned with their user base as they do not prioritize profits. Some successful examples exist, e.g. Wikipedia. The problem is that ultimately the user base is at the mercy of the body which governs the non-profit. If that governance is benevolent and competent, everything is good. If it's not, the service might be ruined. End users have little to no opportunity to influence this outcome, and no guarantees that the service they signed up for will remain consistent in the future. A non-profit can also have significant overhead costs and no obvious way to support itself in the long term. The result is that there are not many online services run by non-profits.

Crowdfunding

One might think that if the public is financing the creation of a service, it might have control over its operations, but this doesn't seem to be the case, as is evident from multiple Kickstarter campaigns which ended with all the usual for-profit behavior. See the case of the Oculus Rift.

Open-source software


Open-source software has addressed some of the problems associated with proprietary software -- vendor lock-in, rent-seeking behavior, lack of interoperability -- but this only works for software which end users install on their own computer and does not require a shared online service. If an online service is required, somebody needs to host the service, leading naturally to the consequence that a single entity has a large influence on many people. Thus open source can be a part of the solution, but is not alone sufficient to address the problem.

Peer to peer software

Peer to peer (P2P) software aims to create an online service through the combined effort of end-user computers. P2P software became a big success in file sharing and content delivery, as this kind of workload is great for pure P2P. Content download can be trivially parallelized as any computer can download from any other, there is no great burden on peers as a peer can store as little as a single file fragment, and it is robust against malicious activity. In the worst case, a corrupt file fragment is served, which can be trivially detected. Most other applications can't be served in a pure P2P fashion, as there are concerns about the consistency of dynamically updated data, potential attacks, and skewed incentives.
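The claim that a corrupt fragment "can be trivially detected" can be illustrated with a minimal sketch. It assumes the downloader already holds a trusted list of fragment digests (called `manifest` here, a hypothetical name) and uses SHA-256 as the hash function; real file-sharing protocols differ in their details.

```python
import hashlib


def fragment_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a file fragment as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fragment(data: bytes, expected_digest: str) -> bool:
    """Check a fragment received from an untrusted peer against a trusted digest."""
    return fragment_digest(data) == expected_digest


# Hypothetical file split into three fragments; the manifest of digests is
# assumed to come from a source the downloader already trusts.
fragments = [b"first piece", b"second piece", b"third piece"]
manifest = [fragment_digest(f) for f in fragments]

# An honest peer serves the right bytes, a malicious peer serves altered bytes.
good = verify_fragment(b"second piece", manifest[1])
bad = verify_fragment(b"second piecE", manifest[1])
```

A rejected fragment can simply be requested again from another peer, which is why this workload tolerates malicious participants so well.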

Blockchain

Bitcoin demonstrated that a service as complex and important as a payment network can function as a peer to peer application. Strictly speaking, Bitcoin isn't pure P2P. In practice we have several different classes of participants:

  • End-users using a wallet of some kind -- they can send payments without working as a part of the network
  • Full nodes -- most closely resembling 'peers' as they all are equal
  • Miners/mining pools -- special 'peers' which coordinate the network, typically using vast amounts of specialized hardware

Thus we see a much more complex structure and more specialization than in pure P2P networks such as Gnutella. Nevertheless, Bitcoin shares certain similarities with older forms of P2P software, e.g. open participation and a decentralized mode of operation.

Bitcoin extended the P2P approach with its use of cryptography, consensus and replication. This allows Bitcoin to thwart various kinds of attacks and maintain consistency which is crucial for a 'payment network'. Right after Bitcoin was announced, people started to ask questions: Can this approach be used to make other kinds of P2P/decentralized applications? People quickly figured out that it can, but trade-offs used in Bitcoin's design only make it directly relevant to a relatively narrow range of applications.

Bitcoin's design can be described as paranoid: it replicates data on a massive scale, makes no use of parallelization (e.g. no sharding), and makes write operations scarce and expensive. This kind of design is warranted when we deal with something like money (a leading theory says that Bitcoin is not so much a payment network as it is 'sound money'), but for few things otherwise.

The first service aside from money to be implemented using a so-called blockchain (the name people give to the Bitcoin-style P2P approach) was a decentralized name system, Namecoin. It was successful as a concept, but not in terms of popularity. Perhaps one reason is that Namecoin itself is rather poorly engineered: it was created as a minimal modification of Bitcoin rather than as something specifically designed to serve as a decentralized name system.

But what else can be built using the blockchain approach? As we can see with Ethereum, which generalizes the approach of Bitcoin into a universal computation platform, the most popular applications are services related to money and tokens (which in terms of technical features behave like money). Other applications suffer from resource limits, which are themselves defined by the 'paranoid' nature of the design.

Publicly hosted applications

We are witnessing a trend as more and more alternatives to proprietary applications and services appear over time. What could be the next step which further widens the applicable range of decentralized applications?

There are two ways to think about it:

  1. Consider currently available technology and identify what adjustments can be used to broaden the scope
  2. Identify what kind of a solution we want, and then identify the necessary technical features

With Chromia we took the second approach.

Approaching a solution

The problem is that almost all online services are privately-hosted, and we believe that a public hosting option for online services would be advantageous for the user. In other words, we would like the users of a service (the public) to be in control of how that service is hosted, and by extension to have some control over various aspects of how that service is operated.

There are essentially two ways to implement this:

  • Single-tier: Users themselves are responsible for hosting the application. This is the traditional P2P approach. As was discussed earlier, this works only for a narrow range of applications. In particular, it is poorly suited for mobile applications. An overwhelming number of users do not like it when an application drains their battery.
  • Multi-tier: Users choose nodes which act as 'service providers' and provide the service. Service providers might receive compensation for their work or get some perks within the network.

The second approach might be compared to a P2P approach called 'supernodes'. In classic P2P (e.g. Skype) a supernode is a computer which is well-connected, e.g. it can be a home server with 24/7 internet connection and static IP address. Blockchain architectures might introduce more tiers:

  • Light/thin clients/nodes which aren't constantly online but need to connect to other nodes to perform their functions.
  • Full nodes.
  • Full nodes with indexing, e.g. Electrum servers in Bitcoin.
  • Masternodes which are specially recognized nodes and serve certain trusted functions. This is present in Dash.
  • Block producers (mining pools, miners, stakers).

We would like to identify a public hosting alternative which is as close as possible to what private hosting can do. Thus the single-tier approach can be ruled out as insufficiently flexible. Let's identify the different classes of participants which might connect to the services:

Ordinary users

Ordinary users are likely to use mobile or web-based clients which cannot be used for large-scale data processing. It's likely that the software will be active only when the user interacts with the application, and thus it cannot serve the needs of other users. It can, however, do minimal processing, such as preparing data to be presented to the user. It can also perform cryptographic operations such as signing and verifying signatures, and it can verify service integrity as long as this verification requires only a limited amount of resources (not proportional to the entire traffic of the application).
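One well-known way to make verification cost independent of total traffic is a Merkle inclusion proof, where a client checks that one record belongs to a large data set using a number of hashes logarithmic in its size. The sketch below is an illustration under that assumption; the article does not say which mechanism would be used, and the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of the given bytes."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Run by a service provider: build the tree over all leaves and return
    (root, proof), where proof lists (sibling_hash, sibling_is_left) pairs."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf, proof, root):
    """Run by a light client: needs only the leaf, the short proof and the root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


messages = [b"msg-0", b"msg-1", b"msg-2", b"msg-3", b"msg-4"]
root, proof = merkle_root_and_proof(messages, 3)

accepted = verify_proof(b"msg-3", proof, root)
rejected = verify_proof(b"forged", proof, root)
```

The client still has to obtain the root from a source it trusts, for example a value agreed on by several independent providers.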

Power users

Power users might be interested in higher degrees of control, customizability, privacy, and they might run software on more powerful computers, such as home servers. They can play a bigger role in verifying the integrity of a system. In case of more demanding applications, power users cannot be expected to bear the entire burden of running the network and generally we expect the ratio of ordinary users to power users to be quite high.

Service providers

Service providers are paid for the service they provide. For example, their role might be to manage the database, answer queries, or to transform data. The difference between this and a normal hosting arrangement is that the service needs to be run in such a way that it is distributed among multiple providers, so that a single provider cannot hold user data hostage, and in such a way that the integrity of operations can be verified. In other words, service providers must be easily replaceable, and users should be in control of what to run and how, as they pay for the service.

There might be other roles in the system, such as developers and data providers. These are easy to address: development of open source software can be commissioned using crowdfunding, and data providers (who input useful information to the system) function similarly to service providers.

Thus what we need for this kind of a public hosting system to function is:

  1. A technical platform which allows a service to be run in such a way that data is replicated onto multiple providers, and this replication is done in a secure way so that one malfunctioning provider cannot take the whole system down. This is known as Byzantine Fault Tolerance (BFT), and practical algorithms are well known. Additionally, it is desirable that the integrity of the service be verifiable by external observers. This can be an automatic consequence of the BFT Replicated State Machine model where all computations are deterministic, and can therefore be replayed.
  2. A marketplace which is open to multiple service providers and allows end users to pay for the service and have some way to choose providers. This is something which can be relatively easily implemented using blockchains/smart contracts.
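The verifiability property in point 1 can be sketched as follows. This is not a BFT consensus algorithm; it only illustrates, under simplified assumptions, how an external observer could replay a deterministic operation log and compare the resulting state digest with what each provider reports. The operation format and provider names are hypothetical.

```python
import hashlib
import json


def apply_op(state, op):
    """Deterministic transition: the same state and operation always give the same result."""
    new_state = dict(state)
    if op["type"] == "post":
        new_state[op["id"]] = op["text"]
    elif op["type"] == "delete":
        new_state.pop(op["id"], None)
    return new_state


def state_digest(state):
    """Hash a canonical (sorted-key) JSON encoding of the state."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def replay(log):
    """Replay the whole log from an empty state and return the final digest."""
    state = {}
    for op in log:
        state = apply_op(state, op)
    return state_digest(state)


log = [
    {"type": "post", "id": "1", "text": "hello"},
    {"type": "post", "id": "2", "text": "world"},
    {"type": "delete", "id": "1"},
]

# Digests reported by three hypothetical providers; provider-c skipped the last operation.
provider_digests = {
    "provider-a": replay(log),
    "provider-b": replay(log),
    "provider-c": replay(log[:-1]),
}

# An external observer replays the same log and flags providers that disagree.
observer_digest = replay(log)
mismatched = sorted(
    name for name, digest in provider_digests.items() if digest != observer_digest
)
```

Because every transition is deterministic, any honest replay of the same log reaches the same digest, so a deviating provider stands out immediately.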

Thus it appears that all technical requirements can be satisfied using currently available technology. Of course, the devil is in the details. For example, the choice of providers responsible for running an application is a non-trivial governance problem. But a simple way to address this is to mimic the "hard forks" of a public blockchain. Imagine a system which allows multiple instances of a single application to exist, and allows users to migrate from one to another together with their data and remaining balance. In this case, users can simply vote with their accounts; if they are not satisfied with how Group A runs an application, they can migrate to Group B. The mere possibility of this provides a strong incentive for providers to work diligently.

Can this decentralized architecture actually improve aspects such as privacy, unreasonable censorship, customizability? While we believe that the answer is "yes", there is much research yet to be done. We intend to explore these questions in more detail in future articles.

Publicly hosted application example

Although we have established that there are no significant technical impediments to creating true publicly hosted applications, it is still not clear what exactly could be run in such a way, and whether it is actually feasible. Let's consider a concrete example to make things more clear. Twitter has become an essential messaging platform for people all around the world. Important news is announced on Twitter, it is used by politicians, executives, and large companies. It is becoming as important as mail, and yet it is controlled by a single US company which can decide what posts to display, what to censor, what to prioritize, and who to ban. It is also hostile towards alternative clients.

Suppose a certain group of Twitter users cares about this so much that they would rather pay for a decentralized version of Twitter. How can they achieve this? To analyze feasibility, we first need to define some metrics. Since we want to know whether publicly hosted apps can work at scale, let's suppose the aforementioned group consists of a million people. We also need some kind of budget estimate, so let's assume each user is willing to pay 1 US dollar per year for this service -- less than the cost of a cup of coffee in many countries.

The group raises one million dollars between them. Experienced programmers can be hired for less than $10k/month, so half of that million can pay for 50 man-months, or roughly 4 people working for a whole year. One might say "But Twitter has thousands of programmers!". Yes, but these programmers mostly work on things like serving ads, analytics for advertisers and so on. Our publicly hosted Twitter alternative has no need for this. If we had a platform which made public app programming about as easy as normal programming, and could scale, 50 man-months would be enough to make something rather full-featured and impressive.

Now on to the hosting side. Hardware nowadays is so powerful that a single server (which might cost about $100 per month) can easily serve the needs of a million users, assuming their requests are spread over time. With a budget of $500k, the application can pay 10 providers $4000 per month each, or $480k per year. This calculation seems to give providers an ample profit margin.
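The budget arithmetic from the last two paragraphs can be checked with a short sketch. It assumes an even split of the million dollars between development and hosting, and it reads the $4000 provider payment as a monthly figure, which is an interpretation of the text rather than something it states explicitly.

```python
# Figures taken from the text
users = 1_000_000
fee_per_user_per_year = 1           # US dollars
developer_cost_per_month = 10_000   # upper bound quoted for an experienced programmer
server_cost_per_month = 100
providers = 10

# Assumption: $4000 is paid to each provider per month
payment_per_provider_per_month = 4_000

budget = users * fee_per_user_per_year
dev_budget = budget // 2              # half for development
hosting_budget = budget - dev_budget  # half for hosting

dev_months = dev_budget // developer_cost_per_month
developers_for_a_year = dev_months / 12

hosting_cost_per_year = providers * payment_per_provider_per_month * 12
provider_margin_per_month = payment_per_provider_per_month - server_cost_per_month

print(f"development: {dev_months} man-months (~{developers_for_a_year:.1f} people for a year)")
print(f"hosting: ${hosting_cost_per_year:,} of ${hosting_budget:,} per year")
print(f"provider margin over one server: ${provider_margin_per_month:,} per month")
```

Under these assumptions the hosting bill stays within its half of the budget, and each provider earns far more than the cost of the single server the text mentions.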

"But wait, didn't Twitter initially have major scaling issues which took years to resolve? This can't be so simple, can it?". Well, again, Twitter was solving a very different problem: they had to compute a list of tweets to show on a page on the server side. The server needed to reply to a request in 50-100 ms, which made it infeasible to execute a query for each subscription. Twitter therefore had to perform multiple writes for each tweet to make it possible to serve pages quickly. Dwidder (as we will call our hypothetical decentralized Twitter) does not need to "serve pages quickly"; in fact, it doesn't have to serve any pages at all. The list of messages displayed to a user can be computed on the client side and updated slowly over time. In other words, the modern web stack can greatly off-load the backend, moving parts of the processing to the front-end.
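Computing the message list on the client can be sketched as a merge of per-author feeds. The sketch assumes the client has already fetched, for each subscription, a list of `(timestamp, text)` pairs sorted newest first; this feed format and the function name `build_timeline` are hypothetical and not part of any described API.

```python
import heapq
from itertools import islice


def build_timeline(feeds, limit):
    """Merge per-author feeds (each sorted newest first) into a single timeline.

    feeds maps an author name to a list of (timestamp, text) pairs.
    Returns at most `limit` entries as (timestamp, author, text), newest first.
    """
    tagged = (
        [(timestamp, author, text) for timestamp, text in posts]
        for author, posts in feeds.items()
    )
    # heapq.merge consumes the inputs lazily, so only the entries that are
    # actually displayed need to be processed.
    merged = heapq.merge(*tagged, key=lambda item: item[0], reverse=True)
    return list(islice(merged, limit))


feeds = {
    "alice": [(105, "a3"), (101, "a1")],
    "bob": [(104, "b2"), (100, "b0")],
    "carol": [(103, "c1")],
}

timeline = build_timeline(feeds, 3)
```

With this division of labour the backend only stores and serves each author's feed, and the per-user fan-out work that Twitter performed on write moves to the reader's device.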

Conclusion

Publicly hosted application infrastructure of the kind that we describe above, and which we are implementing with Chromia, posits a vision of a reformed internet which is truly aligned with the needs of its users. A more equitable relationship between dapp entrepreneurs and users can lead to an application ecosystem which is healthier in many ways, and which has accountability and transparency at its heart.

Unlike open-source or non-profit initiatives, this does not mean that there is no money to be made. Rather, this model uses technology to address bottlenecks in the current architecture of the internet which have enriched the few at the expense of the many. Without the ability to seize and control these bottlenecks, those who wish to profit must innovate and compete to design applications which deliver value to their users without exploiting them. This is our vision for publicly hosted applications on Chromia.