Let's start by revealing some facts:
Facebook identity crisis
533 million Facebook users' phone numbers and personal data have been leaked online.
Security researchers say hackers could use the data to impersonate people and commit fraud.
Just to give you some idea how bad it is -- Facebook said that it was rebranding as Meta, taking a step to distance itself from the controversies it faces.
Countries are passing new laws to make user privacy a thing of the past
It's not news that some autocratic regimes (like Russia) prohibit anonymity by law.
In the wake of the protests, the authorities continued to reduce the space for online mobilization, arresting activists who organized online, banning the most prominent opposition groups, and labeling them as “extremist.” -- https://freedomhouse.org/country/russia/freedom-net/2021
The authorities in Russia now require privately owned messenger and social media apps to store and share information about users without a court order.
In contrast to autocratic regimes, consumers in the Western world are becoming more aware of PII issues.
All the concerns mentioned above share one root cause -- the PII is kept in centralized storage.
The conclusion that naturally follows from all this:
If no identity is ever stored on a centralized server -- it can never be lost or compromised.
Before we continue, let's answer the following question:
What is PII after all?
According to Guidance on the Protection of Personal Identifiable Information -- Personal Identifiable Information (PII) is defined as: Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. ... It is the responsibility of the individual user to protect data to which they have access.
Yes, the definition given on the government site is a little dry, but what does it really mean in layman's terms?
It simply states that any information which can be used to figure out who you are is considered PII. For instance, if someone hacks into a database and steals 500k phone numbers -- that is not PII, because these numbers can eventually be guessed by anyone. Even if you know a number, you don't know who it belongs to, so it carries little value to whoever possesses it. However, if someone steals 500k phone numbers along with the names and/or home addresses of the people they belong to -- this information is PII and is super valuable to the bad guys. When something like this happens, it's a disaster of epic proportions.
PII today? How does it work?
There is no such thing as privacy on the internet any more.
Even the promise of cryptocurrencies, the ability to conduct financial transactions online anonymously, is rapidly eroding.
Cryptocurrency exchanges, like Coinbase, require users not just to reveal some PII -- you must prove that you are who you say you are by uploading a copy of your state-issued driver's license.
So, in order to do anything meaningful on the internet these days, you must reveal your PII to 3rd party service providers in one form or another. Financial or banking applications may require more advanced forms of PII, like your SSN and/or a copy of your driver's license. Social Network providers will most likely ask you for your email address and/or mobile phone number (which is still PII by definition) so that they can send you a confirmation code, allow you to recover your password if it's ever lost, or let you improve security by enabling multi-factor authentication. The reality is that all this information (no matter how insignificant it may seem) is PII, and the more providers you reveal it to, the more likely it is to be stolen or compromised.
A common-sense security principle states that a system is only as secure as its least secure component. As such, exposing your PII to more 3rd parties greatly increases the chance of it being stolen. Who would have thought that the data you provided to Facebook might be compromised and used by the bad guys against you? While Facebook may be the most famous example, here is the list of the top 10 data leaks in 2021:
10. Android Users Data Leak -- 100+ million
9. Thailand Visitors -- 106+ million
8. Raychat -- 150 million
7. Stripchat -- 200 million
6. Socialarks -- 214+ million
5. Brazilian Database -- 223 million
4. Bykea -- 400 million
3. Facebook -- 533 million
2. LinkedIn -- 700 million
1. Cognyte -- 5 billion
No doubt, all these 3rd parties had only the best intentions. They asked users to sign up and sign in so that individuals could enjoy the best possible user experience, tailored specifically to them. They also asked for additional information, like your mobile phone number and/or email address, in order to increase your protection on the internet by offering multi-factor authentication, so that the bad guys would have to jump through more hoops before they could start acting on your behalf. But when your data is stolen (and it's not a question of "if", it's really a matter of "when") -- nothing will stop the bad guys (see the top 10 list above).
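The "when, not if" intuition can be made a little more concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that every provider holding your data has the same independent chance of being breached in a given year. The 5% figure and the function name are made-up assumptions, not measured values.

```python
def breach_probability(p_single: float, providers: int) -> float:
    """Chance that at least one of `providers` independent services is breached."""
    return 1 - (1 - p_single) ** providers


# Hypothetical 5% yearly breach chance per provider (illustrative only).
for n in (1, 5, 20, 50):
    print(f"{n:>2} providers: {breach_probability(0.05, n):.0%}")
```

Under these assumptions the risk climbs quickly as the number of providers grows, which is the point of the "least secure component" principle: every additional copy of your data is one more place it can leak from.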
Below is the diagram depicting the centralized data storage model:
In this scenario, the Centralized Storage is the single point of failure from a security point of view.
The only way to prevent data leaks is to make sure you never share the data with 3rd parties -- they can't lose what they don't have.
But this brings up an important question: can you let your friends know who you are, but remain anonymous to the rest of the world (including the 3rd party application providers)? We believe the answer to this question is "yes".
Does the social network really need to know your name, your phone number, your email address to make your social experience superb? Not really.
Social Network services help us (the users) organize our social connections, so that we can exchange content and messages with our friends securely. The irony is that Social Network providers really have no use for your PII -- they operate at the level of abstract record IDs. They don't even need to know who you are to be able to serve paid ads to individual devices, which, by the way, is the main source of income for SN providers. So, what if, instead of providing your PII to 3rd parties, you only shared it directly with the friends you trust, and still allowed the 3rd parties to operate only at the level of abstract IDs (as they should)?
As a proof of concept (POC) of the idea of using centralized data storage to hold user-generated content (images, messages, and user relationships) while sharing User Identity and PII directly between users, we implemented an application called What I Saw -- and it works extremely well.
While the application does not require users to sign up or sign in, friends securely exchange their identities with each other, which allows them to recognize their friends' content, which is stored anonymously in the centralized cloud storage.
When you download and install the app on your mobile device (available on iOS and Android), you will see all user-generated content as anonymous at first. However, once you establish a friendship with people you trust, you will be able to tell your friends' images and comments apart from those of the rest of the world.
The friendship request workflow
Here is the step-by-step workflow for revealing your identity, as implemented in the POC:
1. You (User A) generate a friendship record in the database.
2. Every mobile device has a randomly assigned UUID (Universally Unique Identifier). This UUID is big enough that it's virtually impossible to guess. But even if someone ever comes across a valid UUID, it will be completely useless because it's not associated with any real user identity. In this step, you store your device's UUID in the friendship record in the centralized database.
3. You (User A) send your friend (User B) a direct message (SMS) with a deep link that encodes the Friendship_UUID.
4. User B finds the Friendship record in the cloud database and stores UUID_B on the other side of the friendship relationship.
5. User A knows who the message was sent to in step 3. At this point User A can also retrieve UUID_B from the centralized database by Friendship_UUID (because it's already populated by User B). User A stores a record in local data storage which maps UUID_B to a real user name from the Local Phone Book.
6. User B knows whom the friend request came from in step 3. So User B retrieves UUID_A from the centralized database by Friendship_UUID, and also stores a record in local data storage which maps UUID_A to a real user name from the Local Phone Book.
From now on, whenever User A browses user-generated content, if an image was produced by a user with UUID_B, the application can retrieve User B's real name from the local storage map and present the image as belonging to User B, rather than as anonymous. All other pieces of content, which don't have a corresponding entry in the local storage map of IDs, will continue to appear to the end user as anonymous.
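The workflow above can be sketched in a few lines of Python. This is not the WiSaw code (the real app is written in React-Native); it is a minimal in-memory simulation in which a dictionary stands in for the centralized database. The class name, method names, and the deep link URL are all hypothetical, chosen only for illustration.

```python
import uuid
from typing import Optional

# Hypothetical stand-in for the centralized database:
# friendship_uuid -> {"uuid_a": ..., "uuid_b": ...}. No names, no phone numbers.
cloud_friendships: dict[str, dict[str, Optional[str]]] = {}

DEEP_LINK_BASE = "https://example.invalid/friends/"  # made-up URL for the sketch


class Device:
    """A phone with a random device UUID and a private, local-only name map."""

    def __init__(self) -> None:
        self.device_uuid = str(uuid.uuid4())
        self.local_names: dict[str, str] = {}  # never sent to the server

    def create_friendship(self) -> str:
        """Create the friendship record, store our UUID, return the deep link."""
        friendship_uuid = str(uuid.uuid4())
        cloud_friendships[friendship_uuid] = {
            "uuid_a": self.device_uuid,
            "uuid_b": None,
        }
        return DEEP_LINK_BASE + friendship_uuid

    def accept_friendship(self, deep_link: str, sender_name: str) -> None:
        """Receiver side: fill in our UUID, map the sender's UUID to a name."""
        record = cloud_friendships[deep_link.rsplit("/", 1)[-1]]
        record["uuid_b"] = self.device_uuid
        self.local_names[record["uuid_a"]] = sender_name

    def confirm_friendship(self, deep_link: str, recipient_name: str) -> None:
        """Sender side: read the receiver's UUID back and map it to a name."""
        uuid_b = cloud_friendships[deep_link.rsplit("/", 1)[-1]]["uuid_b"]
        if uuid_b is not None:
            self.local_names[uuid_b] = recipient_name

    def display_author(self, author_uuid: str) -> str:
        """Resolve a content author's UUID locally; unknown authors stay anonymous."""
        return self.local_names.get(author_uuid, "Anonymous")


user_a, user_b = Device(), Device()
link = user_a.create_friendship()          # User A texts this link to User B
user_b.accept_friendship(link, "User A")   # User B knows who sent the SMS
user_a.confirm_friendship(link, "User B")  # User A knows whom the SMS went to

print(user_a.display_author(user_b.device_uuid))   # a friend's content
print(user_a.display_author(str(uuid.uuid4())))    # a stranger's content
```

The names live only in each device's `local_names` map; the shared dictionary never receives anything but UUIDs, which is the property the workflow relies on.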
But wait, you may say, we still have to share the devices' UUIDs with the server.
Even if the bad guys get access to the database, here is what they will see in the Friendship table, for instance:
These numbers tell the following story:
there are some users in the application who are friends, but you will never be able to guess who these users are in real life, because the PII is nowhere to be found in the centralized database.
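For illustration, a leaked Friendship table might look like the output of the sketch below. The column names are assumptions made for this example, not the actual WiSaw schema, and the sketch assumes the identifiers are version 4 (random) UUIDs, which the write-up does not state explicitly.

```python
import uuid

# Hypothetical dump of a Friendship table: three opaque identifiers per row.
leaked_rows = [
    {
        "friendship_uuid": str(uuid.uuid4()),
        "uuid_a": str(uuid.uuid4()),
        "uuid_b": str(uuid.uuid4()),
    }
    for _ in range(3)
]

for row in leaked_rows:
    print(row)

# A version 4 UUID carries 122 random bits, so blind guessing is impractical.
guess_space = 2 ** 122
print(f"possible random UUIDs: {guess_space:.2e}")
```

Every value in such a dump is a random identifier. Without the name maps kept on the users' own devices, there is nothing to link a row to a real person.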
This approach is very different from the way social networks work today -- it may take time for Internet users to gain the necessary understanding and level of comfort before it is accepted by the general public and goes mainstream. But we believe this approach has a future and, with time, will evolve into more elaborate ways to interact online safely and securely, without compromising the convenience of a modern user experience.
The WiSaw application is open source, so anybody can validate any of the assumptions described in this write-up. See for yourself:
Mobile app for iOS and Android written in React-Native: https://github.com/echowaves/WiSaw
Backend hosted on AWS: https://github.com/echowaves/WiSaw.cdk