Europe takes another step towards copyright pre-filters for user generated content

In a key vote this morning, the European Parliament’s legal affairs committee backed the two most controversial elements of a digital copyright reform package — elements which critics warn could have a chilling effect on Internet norms like memes and damage freedom of expression online.

In the draft copyright directive, Article 11, “Protection of press publications concerning online uses” — which targets news aggregator business models by setting out a neighboring right for snippets of journalistic content, requiring a license from the publisher to use this type of content (aka ‘the link tax’, as critics dub it) — was adopted by a 13:12 majority of the legal affairs committee.

While Article 13, “Use of protected content by online content sharing service providers”, which makes platforms directly liable for copyright infringements by their users — thereby pushing them towards creating filters that monitor all content uploads, with all the associated potential chilling effects (aka ‘censorship machines’) — was adopted by a 15:10 majority.

MEPs critical of the proposals have vowed to continue to oppose the measures, and the EU parliament will eventually need to vote as a whole.

EU Member State representatives in the EU Council will also need to vote on the reforms before the directive can become law. Though, as it stands, a majority of European governments appear to back the proposals.

European digital rights group EDRi, a long-standing critic of Article 13, has a breakdown of the next steps for the copyright directive here.

Derailing the proposals now essentially rests on whether enough MEPs can be convinced it’s politically expedient to do so — factoring in a timeline that includes the next EU parliament elections, in May 2019.

Last week, a coalition of original Internet architects, computer scientists, academics and supporters — including Sir Tim Berners-Lee, Vint Cerf, Bruce Schneier, Jimmy Wales and Mitch Kapor — penned an open letter to the European Parliament’s president to oppose Article 13, warning that while “well-intended” the requirement that Internet platforms perform automatic filtering of all content uploaded by users “takes an unprecedented step towards the transformation of the Internet from an open platform for sharing and innovation, into a tool for the automated surveillance and control of its users”.

“As creators ourselves, we share the concern that there should be a fair distribution of revenues from the online use of copyright works, that benefits creators, publishers, and platforms alike. But Article 13 is not the right way to achieve this,” they write in the letter.

“By inverting this liability model and essentially making platforms directly responsible for ensuring the legality of content in the first instance, the business models and investments of platforms large and small will be impacted. The damage that this may do to the free and open Internet as we know it is hard to predict, but in our opinions could be substantial.”

The Wikimedia Foundation also blogged separately, setting out some specific concerns about the impact that mandatory upload filters could have on Wikipedia.

“[A]ny sort of law which mandates the deployment of automatic filters to screen all uploaded content using AI or related technologies does not leave room for the types of community processes which have been so effective on the Wikimedia projects,” it warned last week. “As previously mentioned, upload filters as they exist today view content through a broad lens, that can miss a lot of the nuances which are crucial for the review of content and assessments of legality or veracity.”

More generally, critics warn that expressive and creative remix formats like memes and GIFs — which have come to form an integral part of the rich communication currency of the Internet — will be at risk if the proposals become law.

Regarding Article 11, Europe already has experience experimenting with a neighboring right for news, after an ancillary copyright law was enacted in Germany in 2013. Local publishers there ended up granting Google free consent to display their snippets after traffic fell substantially when Google, rather than pay to use their content, simply stopped showing it.

Spain also enacted a similar law for publishers in 2014, but its implementation required publishers to charge for using their snippets — leading Google to permanently close its news aggregation service in the country.

Critics of this component of the digital copyright reform package also warn it’s unclear what kinds of news content will constitute a snippet, and thus fall under the proposal — even suggesting that a URL including the headline of an article could fall foul of the copyright extension, meaning the hyperlink itself could be in danger.

They also argue that an amendment giving Member States the flexibility to decide whether a snippet should be considered “insubstantial” (and thus freely shared) does not clear up the problems — saying it just risks causing fresh fragmentation across the bloc, at a time when the Commission is keenly pushing a so-called ‘Digital Single Market’ strategy.

“Instead of one Europe-wide law, we’d have 28,” warns MEP Julia Reda of that prospect. “With the most extreme becoming the de-facto standard: To avoid being sued, international internet platforms would be motivated to comply with the strictest version implemented by any member state.”

Returning to Article 13, the EU’s executive, the Commission — the body responsible for drafting the copyright reforms — has also been pushing online platforms towards pre-filtering content as a mechanism for combating terrorist content, setting out a “one hour rule” for takedowns of this type of content earlier this year, for example.

But again critics of the copyright reforms argue it’s outrageously disproportionate to seek to apply the same measures that are being applied to try to clamp down on terrorist propaganda and serious criminal offenses like child exploitation to police copyright.

“For copyrighted content these automated tools simply undermine copyright exceptions. And they are not proportionate,” Reda told us last year. “We are not talking about violent crimes here in the way that terrorism or child abuse are. We’re talking about something that is a really widespread phenomenon and that’s dealt with by providing attractive legal offers to people. And not by treating them as criminals.”

In a statement reacting to the committee vote today, Monique Goyens, director general of European consumer rights group BEUC, warned: “The internet as we know it will change when platforms will need to systematically filter content that users want to upload. The internet will change from a place where consumers can enjoy sharing creations and ideas to an environment that is restricted and controlled. Fair remuneration for creators is important, but consumers should not be at the losing end.”

Goyens blamed the “pressure of the copyright industry” for scuppering “even modest attempts to modernise copyright law”.

“Today’s rules are outdated and patchy. It is high time that copyright laws take into account that consumers share and create videos, music and photos on a daily basis. The majority of MEPs failed to find a solution that would have benefitted consumers and creators,” she added.

Blockchain browser Brave starts opt-in testing of on-device ad targeting

Brave, an ad-blocking web browser with a blockchain-based twist, has started trials of ads that reward viewers for watching them — the next step in its ambitious push towards a consent-based, pro-privacy overhaul of online advertising.

Brave’s Basic Attention Token (BAT) is the underlying micropayments mechanism it’s using to fuel the model. The startup was founded in 2015 by former Mozilla CEO Brendan Eich, and had a hugely successful initial coin offering last year.

In a blog post announcing the opt-in trial yesterday, Brave says it’s started “voluntary testing” of the ad model before it scales up to additional user trials.

These first tests involve around 250 “pre-packaged ads” being shown to trial volunteers via a dedicated version of the Brave browser that’s both loaded with the ads and capable of tracking users’ browsing behavior.

The startup signed up Dow Jones Media Group as a partner for the trial-based ad content back in April.

People interested in joining these trials are being asked to contact its Early Access group — via community.brave.com.

Brave says the test is intended to analyze user interactions to generate test data for training its on-device machine learning algorithms. So while its ultimate goal for the BAT platform is to be able to deliver ads without eroding individual users’ privacy via this kind of invasive tracking, the test phase does involve “a detailed log” of browsing activity being sent to it.

Though Brave also specifies: “Brave will not share this information, and users can leave this test at any time by switching off this feature or using a regular version of Brave (which never logs user browsing data to any server).”

“Once we’re satisfied with the performance of the ad system, Brave ads will be shown directly in the browser in a private channel to users who consent to see them. When the Brave ad system becomes widely available, users will receive 70% of the gross ad revenue, while preserving their privacy,” it adds.

The key privacy-by-design shift Brave is working towards is moving ad targeting from a cloud-based ad exchange to the local device where users can control their own interactions with marketing content, and don’t have to give up personal data to a chain of opaque third parties (armed with hooks and data-sucking pipes) in order to do so.

Local device ad targeting will work by Brave pushing out ad catalogs (one per region and natural language) to available devices on a recurring basis.

“Downloading a catalog does not identify any user,” it writes. “As the user browses, Brave locally matches the best available ad from the catalog to display that ad at the appropriate time. Brave ads are opt-in and consent-based (disabled by default), and engineered to operate without leaking the user’s personal data from their device.”
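
To make that concrete, here’s a minimal sketch of what on-device matching along those lines could look like: the browser holds a downloaded catalog, scores each ad against interest signals derived locally from browsing, and picks a winner without anything leaving the device. The data structures, scoring scheme and names below are our own illustrative assumptions, not Brave’s implementation.

```python
# Illustrative sketch only: catalog entries, interest signals and the
# overlap-based scoring are assumptions, not Brave's actual ad system.
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class CatalogAd:
    ad_id: str
    topics: Set[str]  # advertiser-declared topics shipped in the regional catalog

def best_local_match(catalog: List[CatalogAd], local_interests: Set[str]) -> Optional[CatalogAd]:
    """Pick the catalog ad that overlaps most with interests inferred on-device."""
    best, best_score = None, 0
    for ad in catalog:
        score = len(ad.topics & local_interests)  # naive overlap score
        if score > best_score:
            best, best_score = ad, score
    return best  # matching stays local; no browsing data is sent anywhere

catalog = [CatalogAd("travel-01", {"flights", "hotels"}),
           CatalogAd("dev-02", {"python", "cloud"})]
print(best_local_match(catalog, {"python", "privacy"}))  # -> dev-02
```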

It couches this approach as “a more efficient and direct opportunity to access user attention without the inherent liabilities and risks involved with large scale user data collection”.

Though there’s still a ways to go before Brave is in a position to prove out its claims — including several more testing phases.

Brave says it’s planning to run further studies later this month with a larger set of users that will focus on improving its user modeling — “to integrate specific usage of the browser, with the primary goal of understanding how behavior in the browser impacts when to deliver ads”.

“This will serve to strengthen existing modeling and data classification engines and to refine the system’s machine learning,” it adds.

After that it says it will start to expand user trials — “in a few months” — focusing testing on the impact of rewards in its user-centric ad system.

“Thousands of ads will be used in this phase, and users will be able to earn tokens for viewing and interacting with ads,” it says of that.

Brave’s initial goal is for users to be able to reward content producers via the utility BAT token stored in a payment wallet baked into the browser. The default distributes the tokens stored in a user’s wallet based on time spent on Brave-verified websites (though users can also make manual tips).
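
As a rough illustration of that default, the split is simply proportional to attention: the wallet balance is divided across verified sites by share of time spent. The sketch below is our own simplification of the idea, not Brave’s actual accounting logic.

```python
# Simplified sketch of time-weighted contribution splitting; thresholds,
# rounding and Brave's real accounting rules are not modeled here.
from typing import Dict

def distribute_bat(wallet_balance: float, attention_seconds: Dict[str, int]) -> Dict[str, float]:
    """Split the wallet across Brave-verified sites in proportion to time spent."""
    total = sum(attention_seconds.values())
    if total == 0:
        return {site: 0.0 for site in attention_seconds}
    return {site: wallet_balance * secs / total for site, secs in attention_seconds.items()}

print(distribute_bat(10.0, {"example-news.org": 1200, "example-blog.net": 300}))
# -> {'example-news.org': 8.0, 'example-blog.net': 2.0}
```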

Though payments using BAT may also ultimately be able to do more.

Its roadmap envisages real ad revenue and donation flow fee revenue being generated via its system this year, and also anticipates BAT integration into “other apps based on open source & specs for greater ad buying leverage and publisher onboarding”.

Pew: Social media still growing in emerging markets but stalled elsewhere

Facebook founder Mark Zuckerberg’s (so far) five-year project to expand access to the Internet in emerging markets makes plenty of business sense when you look at the latest report by the Pew Research Center — which shows social media use has plateaued across developed markets but continues to rise in the developing world.

In 2015-16, roughly four-in-ten adults across the emerging nations surveyed by Pew said they used social networking sites, and as of 2017, a majority (53%) use social media. Whereas, over the same period, social media use has generally been flat in many of the advanced economies surveyed.

Internet use and smartphone ownership have also stayed level in developed markets over the same period vs rising in emerging economies.

Pew polled more than 40,000 respondents in 37 countries over a roughly three-month period, from February to May last year, for this piece of research.

The results show how developing markets are of clear and vital importance for social behemoth Facebook as a means to eke continued growth out of its primary ~15-year-old platform — as well as for the wider suite of social products it’s acquired around that. (Pew’s research asked people about multiple different social media sites, with suggested examples being country-specific — though Facebook and Twitter were staples.)

Especially since — as Pew also found — among those who use the internet, people in developing countries often turn out to be more likely than their counterparts in advanced economies to network via social platforms such as Facebook (and Twitter).

Which in turn suggests there are major upsides for social platforms getting into an emerging Internet economy early enough to establish themselves as a go-to networking service.

This dynamic doubtless explains why Facebook has been so leaden in its response to some very stark risks attached to how its social products accelerate the spread and consumption of misinformation in some developing countries, such as Myanmar and India.

Pulling the plug on its social products in emerging markets essentially means pulling the plug on business growth.

Though, in the face of rising political risk attached to Facebook’s own business and growing controversies attached to various products it offers, the company has reportedly rowed back from offering its ‘Free Basics’ Internet.org package in more than half a dozen countries in recent months, according to analysis by The Outline.

In March, for example, the UN warned that Facebook’s platform was contributing to the spread of hate speech and ethnic violence in crisis-hit Myanmar.

The company has also faced specific questions from US and EU lawmakers about its activities in the country — with scrutiny on the company dialed up to 11 after a major global privacy scandal that broke this spring.

And, in recent months, Facebook policy staffers have had to spend substantial quantities of man-hours penning multi-page explanations for all sorts of aspects of the company’s operations to try to appease angry politicians. So it looks pretty safe to conclude that the days of Facebook being able to pass off Internet.org-fueled business expansion as a ‘humanitarian mission’ are well and truly done.

(Its new ‘humanitarian project’ is a new matchmaking feature — which really looks like an attempt to rekindle stalled growth in mature markets.)

Given how the social media usage gap is closing between developed and developing countries, there’s also perhaps a question mark over how much longer Facebook can generally rely on tapping emerging markets to pump its business growth.

Although Pew’s survey highlights some pretty major variations in usage even across developed markets, with social media being hugely popular in North America and the Middle East, for example, but more of a patchwork story in Europe, where usage is “far from ubiquitous” — such as in Germany, where 87% of people use the internet but less than half say they use social media.

Cultural barriers to social media addiction are perhaps rather harder for a multinational giant to defeat than infrastructure challenges or even economic barriers (though Facebook does not appear to be giving up on that front either).

Outside Europe, nations with still major growth potential on the social media front include India, Indonesia and nations in sub-Saharan Africa, according to the Pew research. And Internet access remains a major barrier to social growth in many of these markets.

“Across the 39 countries [surveyed], a median of 75% say they either use the internet occasionally or own a smartphone, our definition of internet use,” it writes. “In many advanced economies, nine-in-ten or more use the internet, led by South Korea (96%). Greece (66%) is the only advanced economy surveyed where fewer than seven-in-ten report using the internet. Conversely, internet use is below seven-in-ten in 13 of the 22 emerging and developing economies surveyed. Among these countries, it is lowest in India and Tanzania, at a quarter of the adult population. Regionally, internet use is lowest in sub-Saharan Africa, where a median of 41% across six countries use the internet. South Africa (59%) is the only country in the region where at least half the population is online.”

India, Indonesia and sub-Saharan Africa are also regions where Facebook has pushed its controversial Internet.org ‘free web’ initiative. Although India banned zero-rated mobile services in 2016 on net neutrality grounds. And Facebook now appears to be at least partially rowing back on this front itself in other markets.

In parallel, the company has also been working on more moonshot-y engineering — solar-powered, high-altitude drones — to try to bring Internet access (and thus social media access) to remoter areas that lack a reliable Internet connection. Although this project remains experimental — and has yet to deliver any commercial services.

Pew’s research also found various digital divides persisting within the surveyed countries, with age, education, income and, in some cases, gender still differentiating who uses the Internet and who does not; and who is active on social media and who is inactive.

Across the globe, for example, it found younger adults are much more likely to report using social media than their older counterparts.

While in some emerging and developing countries, men are much more likely to use social media than women — in Tunisia, for example, 49% of men use social networking sites, compared with just 28% of women. Yet in advanced countries, it found social networking is often more popular among women.

Pew also found significant differences in social media use across other demographic groups: Those with higher levels of education and those with higher incomes were found to be more likely to use social network sites.

UK report warns DeepMind Health could gain ‘excessive monopoly power’

DeepMind’s foray into digital health services continues to raise concerns. The latest worries are voiced by a panel of external reviewers appointed by the Google-owned AI company to report on its operations after its initial data-sharing arrangements with the U.K.’s National Health Service (NHS) ran into a major public controversy in 2016.

The DeepMind Health Independent Reviewers’ 2018 report flags a series of risks and concerns, as they see it, including the potential for DeepMind Health to be able to “exert excessive monopoly power” as a result of the data access and streaming infrastructure that’s bundled with provision of the Streams app — and which, contractually, positions DeepMind as the access-controlling intermediary between the structured health data and any other third parties that might, in the future, want to offer their own digital assistance solutions to the Trust.

While the underlying FHIR (Fast Healthcare Interoperability Resources) standard deployed by DeepMind for Streams uses an open API, the contract between the company and the Royal Free Trust funnels connections via DeepMind’s own servers, and prohibits connections to other FHIR servers — a commercial structure that seemingly works against the openness and interoperability DeepMind’s co-founder Mustafa Suleyman has claimed to support.
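
For context, FHIR reads are plain HTTP requests against whichever server exposes the resources, which is what makes the standard interoperable in principle. The sketch below, using a hypothetical base URL, shows the shape of such a request; the point the reviewers make is contractual, not technical — under the Streams agreement the only reachable base is DeepMind’s own server rather than any other FHIR endpoint.

```python
# A standard FHIR REST read. The base URL is hypothetical, for illustration only.
import requests

FHIR_BASE = "https://fhir.gateway.example"  # hypothetical endpoint, not a real service

def read_patient(patient_id: str) -> dict:
    """Fetch a FHIR Patient resource as JSON from whichever server is contractually allowed."""
    resp = requests.get(
        f"{FHIR_BASE}/Patient/{patient_id}",
        headers={"Accept": "application/fhir+json"},
    )
    resp.raise_for_status()
    return resp.json()
```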

“There are many examples in the IT arena where companies lock their customers into systems that are difficult to change or replace. Such arrangements are not in the interests of the public. And we do not want to see DeepMind Health putting itself in a position where clients, such as hospitals, find themselves forced to stay with DeepMind Health even if it is no longer financially or clinically sensible to do so; we want DeepMind Health to compete on quality and price, not by entrenching legacy position,” the reviewers write.

Though they point to DeepMind’s “stated commitment to interoperability of systems,” and “their adoption of the FHIR open API” as positive indications, writing: “This means that there is potential for many other SMEs to become involved, creating a diverse and innovative marketplace which works to the benefit of consumers, innovation and the economy.”

“We also note DeepMind Health’s intention to implement many of the features of Streams as modules which could be easily swapped, meaning that they will have to rely on being the best to stay in business,” they add. 

However, stated intentions and future potentials are clearly not the same as on-the-ground reality. And, as it stands, a technically interoperable app-delivery infrastructure is being encumbered by prohibitive clauses in a commercial contract — and by a lack of regulatory pushback against such behavior.

The reviewers also raise concerns about an ongoing lack of clarity around DeepMind Health’s business model — writing: “Given the current environment, and with no clarity about DeepMind Health’s business model, people are likely to suspect that there must be an undisclosed profit motive or a hidden agenda. We do not believe this to be the case, but would urge DeepMind Health to be transparent about their business model, and their ability to stick to that without being overridden by Alphabet. For once an idea of hidden agendas is fixed in people’s mind, it is hard to shift, no matter how much a company is motivated by the public good.”

“We have had detailed conversations about DeepMind Health’s evolving thoughts in this area, and are aware that some of these questions have not yet been finalised. However, we would urge DeepMind Health to set out publicly what they are proposing,” they add.

DeepMind has suggested it wants to build healthcare AIs that are capable of charging by results. But Streams does not involve any AI. The service is also being provided to NHS Trusts for free, at least for the first five years — raising the question of how exactly the Google-owned company intends to recoup its investment.

Google of course monetizes a large suite of free-at-the-point-of-use consumer products — such as the Android mobile operating system; its cloud email service Gmail; and the YouTube video sharing platform, to name three — by harvesting people’s personal data and using that information to inform its ad targeting platforms.

Hence the reviewers’ recommendation for DeepMind to set out its thinking on its business model to avoid its intentions vis-a-vis people’s medical data being viewed with suspicion.

The company’s historical modus operandi also underlines the potential monopoly risks if DeepMind is allowed to carve out a dominant platform position in digital healthcare provision — given how effectively its parent has been able to turn a free-for-OEMs mobile OS (Android) into global smartphone market OS dominance, for example.

So, while DeepMind only has a handful of contracts with NHS Trusts for the Streams app and delivery infrastructure at this stage, the reviewers’ concerns over the risk of the company gaining “excessive monopoly power” do not seem overblown.

They are also worried about DeepMind’s ongoing vagueness about how exactly it works with its parent Alphabet, and what data could ever be transferred to the ad giant — an inevitably queasy combination when stacked against DeepMind’s handling of people’s medical records.

“To what extent can DeepMind Health insulate itself against Alphabet instructing them in the future to do something which it has promised not to do today? Or, if DeepMind Health’s current management were to leave DeepMind Health, how much could a new CEO alter what has been agreed today?” they write.

“We appreciate that DeepMind Health would continue to be bound by the legal and regulatory framework, but much of our attention is on the steps that DeepMind Health have taken to take a more ethical stance than the law requires; could this all be ended? We encourage DeepMind Health to look at ways of entrenching its separation from Alphabet and DeepMind more robustly, so that it can have enduring force to the commitments it makes.”

Responding to the report’s publication on its website, DeepMind writes that it’s “developing our longer-term business model and roadmap.”

“Rather than charging for the early stages of our work, our first priority has been to prove that our technologies can help improve patient care and reduce costs. We believe that our business model should flow from the positive impact we create, and will continue to explore outcomes-based elements so that costs are at least in part related to the benefits we deliver,” it continues.

So it has nothing to say to defuse the reviewers’ concerns about making its intentions for monetizing health data plain — beyond deploying a few choice PR soundbites.

On its links with Alphabet, DeepMind also has little to say, writing only that: “We will explore further ways to ensure there is clarity about the binding legal frameworks that govern all our NHS partnerships.”

“Trusts remain in full control of the data at all times,” it adds. “We are legally and contractually bound to only using patient data under the instructions of our partners. We will continue to make our legal agreements with Trusts publicly available to allow scrutiny of this important point.”

“There is nothing in our legal agreements with our partners that prevents them from working with any other data processor, should they wish to seek the services of another provider,” it also claims in response to additional questions we put to it.

“We hope that Streams can help unlock the next wave of innovation in the NHS. The infrastructure that powers Streams is built on state-of-the-art open and interoperable standards, known as FHIR. The FHIR standard is supported in the UK by NHS Digital, NHS England and the INTEROPen group. This should allow our partner trusts to work more easily with other developers, helping them bring many more new innovations to the clinical frontlines,” it adds in additional comments to us.

“Under our contractual agreements with relevant partner trusts, we have committed to building FHIR API infrastructure within the five year terms of the agreements.”

Asked about the progress it’s made on a technical audit infrastructure for verifying access to health data, which it announced last year, it reiterated the wording on its blog, saying: “We will remain vigilant about setting the highest possible standards of information governance. At the beginning of this year, we appointed a full time Information Governance Manager to oversee our use of data in all areas of our work. We are also continuing to build our Verifiable Data Audit and other tools to clearly show how we’re using data.”

So developments on that front look as slow as we expected.

The Google-owned U.K. AI company began its push into digital healthcare services in 2015, quietly signing an information-sharing arrangement with a London-based NHS Trust that gave it access to around 1.6 million people’s medical records for developing an alerts app for a condition called Acute Kidney Injury.

It also inked an MoU with the Trust in which the pair set out their ambition to apply AI to NHS data sets. (They even went so far as to get ethical sign-off for an AI project — but have consistently claimed the Royal Free data was not fed to any AIs.)

However, the data-sharing collaboration ran into trouble in May 2016 when the scope of patient data being shared by the Royal Free with DeepMind was revealed (via investigative journalism, rather than by disclosures from the Trust or DeepMind).

None of the ~1.6 million people whose non-anonymized medical records had been passed to the Google-owned company had been informed or asked for their consent. And questions were raised about the legal basis for the data-sharing arrangement.

Last summer the U.K.’s privacy regulator concluded an investigation of the project — finding that the Royal Free NHS Trust had broken data protection rules during the app’s development.

Yet despite ethical questions and regulatory disquiet about the legality of the data sharing, the Streams project steamrollered on. And the Royal Free Trust went on to implement the app for use by clinicians in its hospitals, while DeepMind has also signed several additional contracts to deploy Streams to other NHS Trusts.

More recently, the law firm Linklaters completed an audit of the Royal Free Streams project, after being commissioned by the Trust as part of its settlement with the ICO. Though this audit only examined the current functioning of Streams. (There has been no historical audit of the lawfulness of people’s medical records being shared during the build and test phase of the project.)

Linklaters did recommend the Royal Free terminate its wider MoU with DeepMind — and the Trust has confirmed to us that it will be following the firm’s advice.

“The audit recommends we terminate the historic memorandum of understanding with DeepMind which was signed in January 2016. The MOU is no longer relevant to the partnership and we are in the process of terminating it,” a Royal Free spokesperson told us.

So DeepMind, probably the world’s most famous AI company, is in the curious position of being involved in providing digital healthcare services to U.K. hospitals that don’t actually involve any AI at all. (Though it does have some ongoing AI research projects with NHS Trusts too.)

In mid 2016, at the height of the Royal Free DeepMind data scandal — and in a bid to foster greater public trust — the company appointed the panel of external reviewers who have now produced their second report looking at how the division is operating.

And it’s fair to say that much has happened in the tech industry since the panel was appointed to further undermine public trust in tech platforms and algorithmic promises — including the ICO’s finding that the initial data-sharing arrangement between the Royal Free and DeepMind broke U.K. privacy laws.

The eight members of the panel for the 2018 report are: Martin Bromiley OBE; Elisabeth Buggins CBE; Eileen Burbidge MBE; Richard Horton; Dr. Julian Huppert; Professor Donal O’Donoghue; Matthew Taylor; and Professor Sir John Tooke.

In their latest report the external reviewers warn that the public’s view of tech giants has “shifted substantially” versus where it was even a year ago — asserting that “issues of privacy in a digital age are if anything, of greater concern.”

At the same time politicians are also gazing rather more critically on the works and social impacts of tech giants.

Although the U.K. government has also been keen to position itself as a supporter of AI, providing public funds for the sector and, in its Industrial Strategy white paper, identifying AI and data as one of four so-called “Grand Challenges” where it believes the U.K. can “lead the world for years to come” — including specifically name-checking DeepMind as one of a handful of leading-edge homegrown AI businesses for the country to be proud of.

Still, questions over how to manage and regulate public sector data and AI deployments — especially in highly sensitive areas such as healthcare — remain to be clearly addressed by the government.

Meanwhile, the encroaching ingress of digital technologies into the healthcare space — even when those technologies don’t involve any AI — is already presenting major challenges, putting pressure on existing information governance rules and structures and raising the specter of monopolistic risk.

Asked whether it offers any guidance to NHS Trusts around digital assistance for clinicians, including specifically whether it requires multiple options be offered by different providers, the NHS’ digital services provider, NHS Digital, referred our question on to the Department of Health (DoH), saying it’s a matter of health policy.

The DoH in turn referred the question to NHS England, the executive non-departmental body which commissions contracts and sets priorities and directions for the health service in England.

And at the time of writing, we’re still waiting for a response from the steering body.

Ultimately it looks like it will be up to the health service to put in place a clear and robust structure for AI and digital decision services that fosters competition by design by baking in a requirement for Trusts to support multiple independent options when procuring apps and services.

Without that important check and balance, the risk is that platform dynamics will quickly dominate and control the emergent digital health assistance space — just as big tech has dominated consumer tech.

But publicly funded healthcare decisions and data sets should not simply be handed to the single market-dominating entity that’s willing and able to burn the most resource to own the space.

Nor should government stand by and do nothing when there’s a clear risk that a vital area of digital innovation is at risk of being closed down by a tech giant muscling in and positioning itself as a gatekeeper before others have had a chance to show what their ideas are made of, and before even a market has had the chance to form. 

Dixons Carphone discloses data breach affecting 5.9M payment cards, 105k of which were compromised

European electronics and telecoms retailer Dixons Carphone has revealed a hack of its systems in which the intruder/s attempted to compromise 5.9 million payment cards.

In a statement put out today it says a review of its systems and data unearthed the data breach. It also confirms it has informed the UK’s data watchdog the ICO, financial conduct regulator the FCA, and the police.

According to the company, the vast majority of the cards (5.8M) were protected by chip-and-PIN technology — and it says the data accessed in respect of these cards contains “neither pin codes, card verification values (CVV) nor any authentication data enabling cardholder identification or a purchase to be made”.

However, around 105,000 of the accessed cards were non-EU issued and lacked chip-and-PIN protection, and the company says those cards have been compromised.

“As a precaution we immediately notified the relevant card companies via our payment provider about all these cards so that they could take the appropriate measures to protect customers. We have no evidence of any fraud on these cards as a result of this incident,” it writes.

In addition to payment cards, the intruders also accessed 1.2M records containing non-financial personal data — such as name, address or email address.

“We have no evidence that this information has left our systems or has resulted in any fraud at this stage. We are contacting those whose non-financial personal data was accessed to inform them, to apologise, and to give them advice on any protective steps they should take,” the company adds.

In a statement about the breach, Dixons Carphone chief executive, Alex Baldock, said: “We are extremely disappointed and sorry for any upset this may cause. The protection of our data has to be at the heart of our business, and we’ve fallen short here. We’ve taken action to close off this unauthorised access and though we have currently no evidence of fraud as a result of these incidents, we are taking this extremely seriously.

“We are determined to put this right and are taking steps to do so; we promptly launched an investigation, engaged leading cyber security experts, added extra security measures to our systems and will be communicating directly with those affected. Cyber crime is a continual battle for business today and we are determined to tackle this fast-changing challenge.”

The company does not reveal when its systems were compromised; nor exactly when it discovered the intrusion; nor how long it took to launch an investigation — writing only that: “As part of a review of our systems and data, we have determined that there has been unauthorised access to certain data held by the company. We promptly launched an investigation, engaged leading cyber security experts and added extra security measures to our systems. We have taken action to close off this access and have no evidence it is continuing. We have no evidence to date of any fraudulent use of the data as result of these incidents.”

New European data protection rules are very strict in respect of data breaches, requiring that data controllers report any security incidents where personal data has been lost, stolen or otherwise accessed by unauthorized third parties to their data protection authority within 72 hours of them becoming aware of it. (Or even sooner if the breach is likely to result in a “high risk of adversely affecting individuals’ rights and freedoms”.)

And failure to promptly disclose breaches can attract major fines under the GDPR data protection framework.

Yesterday the ICO issued a £250k penalty for a Yahoo data breach dating back to 2014 — though that was under the UK’s prior data protection regime which capped fines at a maximum of £500k. Whereas under GDPR fines can scale up to 4% of a company’s global annual turnover (or €20M, whichever is greater).
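
To put the two regimes side by side: the new ceiling is whichever of the two figures is larger, so the percentage only overtakes the €20M floor once global annual turnover exceeds €500M. A quick illustration:

```python
# The GDPR cap is the greater of 4% of global annual turnover or €20M.
def gdpr_max_fine(annual_turnover_eur: float) -> float:
    return max(0.04 * annual_turnover_eur, 20_000_000)

print(gdpr_max_fine(100_000_000))     # €100M turnover -> €20M (the flat floor applies)
print(gdpr_max_fine(10_000_000_000))  # €10BN turnover -> €400M (the 4% rate dominates)
```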

We’ve reached out to the ICO for comment on the Dixons Carphone breach and will update this story with any response.

Carphone Warehouse, a mobile division of Dixons Carphone, also suffered a major hack in 2015 — and the company was fined £400k by the ICO in January for that data breach which affected around 3M people.

The company’s stock dropped around 5% this morning after it reported the latest breach, before recovering slightly but still down around 3.5% at the time of writing.

SoftBank Vision Fund leads $250M Series D for Cohesity’s hyperconverged data platform

San Jose-based Cohesity has closed an oversubscribed $250M Series D funding round led by SoftBank’s Vision Fund, bringing its total raised to date to $410M. The enterprise software company offers a hyperconverged data platform for storing and managing all the secondary data created outside of production apps.

In a press release today it notes this is only the second time SoftBank’s gigantic Vision Fund has invested in an enterprise software company. The fund, which is almost $100BN in size — without factoring in all the planned sequels — also led an investment in enterprise messaging company Slack back in September 2017 (also a $250M round).

“Cohesity pioneered hyperconverged secondary storage as a first stepping stone on the path to a much larger transformation of enterprise infrastructure spanning public and private clouds. We believe that Cohesity’s web-scale Google-like approach, cloud-native architecture, and incredible simplicity is changing the business of IT in a fundamental way,” said Deep Nishar, senior managing partner at SoftBank Investment Advisers, in a supporting statement.

Also participating in the financing are Cohesity’s existing strategic investors Cisco Investments, Hewlett Packard Enterprise (HPE), and Morgan Stanley Expansion Capital, along with early investor Sequoia Capital and others.

The company says the investment will be put towards “large-scale global expansion” by selling more enterprises on the claimed cost and operational savings from consolidating multiple separate point solutions onto its hyperconverged platform. On the customer acquisition front it flags up support from its strategic investors, Cisco and HPE, to help it reach more enterprises.

Cohesity says it’s onboarded more than 200 new enterprise customers in the last two quarters — including Air Bud Entertainment, AutoNation, BC Oil and Gas Commission, Bungie, Harris Teeter, Hyatt, Kelly Services, LendingClub, Piedmont Healthcare, Schneider Electric, the San Francisco Giants, TCF Bank, the U.S. Department of Energy, the U.S. Air Force, and WestLotto — and says annual revenues grew 600% between 2016 and 2017.

In another supporting statement, CEO and founder Mohit Aron, added: “My vision has always been to provide enterprises with cloud-like simplicity for their many fragmented applications and data — backup, test and development, analytics, and more.

“Cohesity has built significant momentum and market share during the last 12 months and we are just getting started.”

Accenture wants to beat unfair AI with a professional toolkit

Next week professional services firm Accenture will be launching a new tool to help its customers identify and fix unfair bias in AI algorithms. The idea is to catch discrimination before it gets baked into models and can cause human damage at scale.

The “AI fairness tool”, as it’s being described, is one piece of a wider package the consultancy firm has recently started offering its customers around transparency and ethics for machine learning deployments — while still pushing businesses to adopt and deploy AI. (So the intent, at least, can be summed up as: ‘Move fast and don’t break things’. Or, in very condensed corporate-speak: “Agile ethics”.) 

“Most of last year was spent… understanding this realm of ethics and AI and really educating ourselves, and I feel that 2018 has really become the year of doing — the year of moving beyond virtue signaling. And moving into actual creation and development,” says Rumman Chowdhury, Accenture’s responsible AI lead — who joined the company when the role was created, in January 2017.

“For many of us, especially those of us who are in this space all the time, we’re tired of just talking about it — we want to start building and solving problems, and that’s really what inspired this fairness tool.”

Chowdhury says Accenture is defining fairness for this purpose as “equal outcomes for different people”. 

“There is no such thing as a perfect algorithm,” she says. “We know that models will be wrong sometimes. We consider it unfair if there are different degrees of wrongness… for different people, based on characteristics that should not influence the outcomes.”

She envisages the tool having wide application and utility across different industries and markets, suggesting early adopters are likely those in the most heavily regulated industries — such as financial services and healthcare, where “AI can have a lot of potential but has a very large human impact”.

“We’re seeing increasing focus on algorithmic bias, fairness. Just this past week we’ve had Singapore announce an AI ethics board. Korea announce an AI ethics board. In the US we already have industry creating different groups — such as The Partnership on AI. Google just released their ethical guidelines… So I think industry leaders, as well as non-tech companies, are looking for guidance. They are looking for standards and protocols and something to adhere to because they want to know that they are safe in creating products.

“It’s not an easy task to think about these things. Not every organization or company has the resources to. So how might we better enable that to happen? Through good legislation, through enabling trust, communication. And also through developing these kinds of tools to help the process along.”

The tool — which uses statistical methods to assess AI models — is focused on one type of AI bias problem that’s “quantifiable and measurable”. Specifically it’s intended to help companies assess the data sets they feed to AI models to identify biases related to sensitive variables and course correct for them, as it’s also able to adjust models to equalize the impact.

To boil it down further, the tool examines the “data influence” of sensitive variables (age, gender, race etc) on other variables in a model — measuring how much of a correlation the variables have with each other to see whether they are skewing the model and its outcomes.

It can then remove the impact of sensitive variables — leaving only the residual impact that, say, ‘likelihood to own a home’ would have on a model output, instead of the output being derived from age and likelihood to own a home, and therefore risking decisions being biased against certain age groups.

“There’s two parts to having sensitive variables like age, race, gender, ethnicity etc motivating or driving your outcomes. So the first part of our tool helps you identify which variables in your dataset that are potentially sensitive are influencing other variables,” she explains. “It’s not as easy as saying: Don’t include age in your algorithm and it’s fine. Because age is very highly correlated with things like number of children you have, or likelihood to be married. Things like that. So we need to remove the impact that the sensitive variable has on other variables which we’re considering to be not sensitive and necessary for developing a good algorithm.”
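
Accenture hasn’t published the tool’s internals, but one standard statistical technique consistent with that description is residualization: regress each ostensibly non-sensitive feature on the sensitive variable and keep only the residual, stripping out the part explained by, say, age. A minimal sketch, assuming that approach:

```python
# Illustrative residualization sketch, not Accenture's actual method: remove the
# part of a feature that is linearly explained by a sensitive variable.
import numpy as np

def residualize(feature: np.ndarray, sensitive: np.ndarray) -> np.ndarray:
    """Return the part of `feature` not linearly explained by `sensitive`."""
    X = np.column_stack([np.ones_like(sensitive), sensitive])  # intercept + sensitive variable
    coef, *_ = np.linalg.lstsq(X, feature, rcond=None)
    return feature - X @ coef

rng = np.random.default_rng(0)
age = rng.uniform(20, 70, 500)
home_ownership = 0.02 * age + rng.normal(0, 0.2, 500)  # strongly correlated with age
cleaned = residualize(home_ownership, age)
print(round(float(np.corrcoef(cleaned, age)[0, 1]), 3))  # ~0.0 after correction
```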

Chowdhury cites an example in the US, where algorithms used to determine parole outcomes were less likely to be wrong for white men than for black men. “That was unfair,” she says. “People were denied parole, who should have been granted parole — and it happened more often for black people than for white people. And that’s the kind of fairness we’re looking at. We want to make sure that everybody has equal opportunity.”

However, a quirk of AI algorithms is that when models are corrected for unfair bias there can be a reduction in their accuracy. So the tool also calculates the accuracy of any trade-off to show whether improving the model’s fairness will make it less accurate and to what extent.

Users get a before and after visualization of any bias corrections. And can essentially choose to set their own ‘ethical bar’ based on fairness vs accuracy — using a toggle bar on the platform — assuming they are comfortable compromising the former for the latter (and, indeed, comfortable with any associated legal risk if they actively select for an obviously unfair tradeoff).

In Europe, for example, there are rules that place an obligation on data processors to prevent errors, bias and discrimination in automated decisions. They can also be required to give individuals information about the logic of an automated decision that affects them. So actively choosing a decision model that’s patently unfair would invite a lot of legal risk.


While Chowdhury concedes there is an accuracy cost to correcting bias in an AI model, she says trade-offs can “vary wildly”. “It can be that your model is incredibly unfair and to correct it to be a lot more fair is not going to impact your model that much… maybe by 1% or 2% [accuracy]. So it’s not that big of a deal. And then in other cases you may see a wider shift in model accuracy.”

She says it’s also possible the tool might raise substantial questions for users over the appropriateness of an entire data-set — essentially showing them that a data-set is “simply inadequate for your needs”.

“If you see a huge shift in your model accuracy that probably means there’s something wrong in your data. And you might need to actually go back and look at your data,” she says. “So while this tool does help with corrections it is part of this larger process — where you may actually have to go back and get new data, get different data. What this tool does is able to highlight that necessity in a way that’s easy to understand.

“Previously people didn’t have that ability to visualize and understand that their data may actually not be adequate for what they’re trying to solve for.”

She adds: “This may have been data that you’ve been using for quite some time. And it may actually cause people to re-examine their data, how it’s shaped, how societal influences influence outcomes. That’s kind of the beauty of artificial intelligence as a sort of subjective observer of humanity.”

While tech giants may have developed their own internal tools for assessing the neutrality of their AI algorithms — Facebook has one called Fairness Flow, for example — Chowdhury argues that most non-tech companies will not be able to develop their own similarly sophisticated tools for assessing algorithmic bias.

Which is where Accenture is hoping to step in with a support service — and one that also embeds ethical frameworks and toolkits into the product development lifecycle, so R&D remains as agile as possible.

“One of the questions that I’m always faced with is how do we integrate ethical behavior in a way that aligns with rapid innovation. So every company is really adopting this idea of agile innovation and development, etc. People are talking a lot about three to six month iterative processes. So I can’t come in with an ethical process that takes three months to do. So part of one of my constraints is how do I create something that’s easy to integrate into this innovation lifecycle.”

One specific drawback is that the tool has not yet been verified as working across different types of AI models. Chowdhury says it’s principally been tested on models that use classification to group people, so it may not be suitable for other types. (Though she says their next step will be to test it on “other kinds of commonly used models”.)

More generally, she says the challenge is that many companies are hoping for a magic “push button” tech fix-all for algorithmic bias. Which of course simply does not — and will not — exist.

“If anything there’s almost an overeagerness in the market for a technical solution to all their problems… and this is not the case where tech will fix everything,” she warns. “Tech can definitely help but part of this is having people understand that this is an informational tool, it will help you, but it’s not going to solve all your problems for you.”

The tool was co-prototyped with the help of a data study group at the UK’s Alan Turing Institute, using publicly available data-sets. 

During prototyping, when the researchers were using a German data-set relating to credit risk scores, Chowdhury says the team realized that nationality was influencing a lot of other variables. And for credit risk outcomes they found decisions were more likely to be wrong for non-German nationals.

They then used the tool to equalize the outcome and found it didn’t have a significant impact on the model’s accuracy. “So at the end of it you have a model that is just as accurate as the previous models were in determining whether or not somebody is a credit risk. But we were confident in knowing that one’s nationality did not have undue influence over that outcome.”

A paper about the prototyping of the tool will be made publicly available later this year, she adds.

Apple got even tougher on ad trackers at WWDC

Apple unveiled a handful of pro-privacy enhancements for its Safari web browser at its annual developer event yesterday, building on an ad tracker blocker it announced at WWDC a year ago.

The feature — which Apple dubbed ‘Intelligent Tracking Prevention’ (ITP) — places restrictions on cookies based on how frequently a user interacts with the website that dropped them. After 30 days of a site not being visited, Safari purges the cookies entirely.

Since debuting ITP, a major data misuse scandal has engulfed Facebook, and consumer awareness about how social platforms and data brokers track them around the web and erode their privacy by building detailed profiles to target them with ads has likely never been higher.

Apple was ahead of the pack on this issue and is now nicely positioned to surf a rising wave of concern about how web infrastructure watches what users are doing by getting even tougher on trackers.

Cupertino’s business model also of course aligns with privacy, given the company’s main money spinner is device sales. And features intended to help safeguard users’ data remain one of the clearest and most compelling points of differentiation vs rival devices running Google’s Android OS, for example.

“Safari works really hard to protect your privacy and this year it’s working even harder,” said Craig Federighi, Apple’s SVP of software engineering during yesterday’s keynote.

He then took direct aim at social media giant Facebook — highlighting how social plugins such as Like buttons, and comment fields which use a Facebook login, form a core part of the tracking infrastructure that follows people as they browse across the web.

In April US lawmakers also closely questioned Facebook’s CEO Mark Zuckerberg about the information the company gleans on users via their offsite web browsing, gathered via its tracking cookies and pixels — receiving only evasive answers in return.

Facebook subsequently announced it will launch a Clear History feature, claiming this will let users purge their browsing history from Facebook. But it’s less clear whether the control will allow people to clear their data off of Facebook’s servers entirely.

The feature requires users to trust that Facebook is doing what it claims to be doing. And plenty of questions remain. So, from a consumer point of view, it’s much better to defeat or dilute tracking in the first place — which is what the clutch of features Apple announced yesterday are intended to do.

“It turns out these [like buttons and comment fields] can be used to track you whether you click on them or not. And so this year we are shutting that down,” said Federighi, drawing sustained applause and appreciative woos from the WWDC audience.

He demoed how Safari will show a pop-up asking users whether or not they want to allow the plugin to track their browsing — letting web browsers “decide to keep your information private”, as he put it.

Safari will also immediately partition cookies for domains that Apple has “determined to have tracking abilities” — removing the 24-hour window after a website interaction that Apple allowed in the first version of ITP.

It has also engineered a feature designed to detect when a domain is solely used as a “first party bounce tracker” — i.e. meaning it is never used as a third party content provider but tracks the user purely through navigational redirects — with Safari also purging website data in such instances.

Another pro-privacy enhancement detailed by Federighi yesterday is intended to counter browser fingerprinting techniques that are also used to track users from site to site — and which can be a way of doing so even when/if tracking cookies are cleared.

“Data companies are clever and relentless,” he said. “It turns out that when you browse the web your device can be identified by a unique set of characteristics like its configuration, its fonts you have installed, and the plugins you might have installed on a device.

“With Mojave we’re making it much harder for trackers to create a unique fingerprint. We’re presenting websites with only a simplified system configuration. We show them only built-in fonts. And legacy plugins are no longer supported so those can’t contribute to a fingerprint. And as a result your Mac will look more like everyone else’s Mac and it will be dramatically more difficult for data companies to uniquely identify your device and track you.”

In a post detailing IPT 2.0 on its WebKit developer blog, Apple security engineer John Wilander writes that Apple researchers found that cross-site trackers “help each other identify the user”.

“This is basically one tracker telling another tracker that ‘I think it’s user ABC’, at which point the second tracker tells a third tracker ‘Hey, Tracker One thinks it’s user ABC and I think it’s user XYZ’. We call this tracker collusion, and ITP 2.0 detects this behavior through a collusion graph and classifies all involved parties as trackers,” he explains, warning developers they should therefore “avoid making unnecessary redirects to domains that are likely to be classified as having tracking ability” — or else risk being mistaken for a tracker and penalized by having website data purged.
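
Apple doesn’t spell out how the collusion graph is built, but the idea can be modeled minimally as a graph problem: domains are nodes, observed redirects are edges, and any domain connected to a classified tracker inherits the tracker label. The sketch below illustrates that idea only; it is not WebKit’s actual classifier.

```python
# Toy model of "tracker collusion" classification: flood the tracker label
# across a graph of observed redirects. Purely illustrative of the concept.
from collections import defaultdict, deque
from typing import List, Set, Tuple

def classify_colluders(redirects: List[Tuple[str, str]], known_trackers: Set[str]) -> Set[str]:
    graph = defaultdict(set)
    for src, dst in redirects:          # treat each observed redirect as an undirected edge
        graph[src].add(dst)
        graph[dst].add(src)
    flagged, queue = set(known_trackers), deque(known_trackers)
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in flagged:
                flagged.add(neighbour)  # connected to a tracker -> classified as a tracker
                queue.append(neighbour)
    return flagged

redirects = [("tracker-one.example", "tracker-two.example"),
             ("tracker-two.example", "tracker-three.example")]
print(classify_colluders(redirects, {"tracker-one.example"}))  # all three get flagged
```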

ITP 2.0 will also downgrade the referrer header of a webpage that a tracker can receive to “just the page’s origin for third party requests to domains that the system has classified as possible trackers and which have not received user interaction” (Apple specifies this is not just a visit to a site but must include an interaction such as a tap/click).

Apple gives the example of a user visiting ‘https://store.example/baby-products/strollers/deluxe-navy-blue.html’, and that page loading a resource from a tracker — which prior to ITP 2.0 would have received a request containing the full referrer, revealing the exact product being browsed and allowing lots of personal information to be inferred about the user.

But under ITP 2.0, the referrer will be reduced to just “https://store.example/”. Which is a very clear privacy win.
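
The truncation itself is easy to picture. A minimal sketch of the idea in Python (illustrative only, obviously not Safari's code) might be:

# Reduce a full referrer URL to just its origin, per the ITP 2.0 behavior
# described above (sketch only).
from urllib.parse import urlsplit

def reduce_referrer(url: str) -> str:
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"

reduce_referrer("https://store.example/baby-products/strollers/deluxe-navy-blue.html")
# -> "https://store.example/"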

Another welcome privacy update for Mac users that Apple announced yesterday — albeit, it’s really just playing catch-up with Windows and iOS — is expanded privacy controls in Mojave around the camera and microphone, so they’re protected by default from any app you run. The user has to authorize access, much like on iOS.

Not just another decentralized web whitepaper?

Given all the hype and noise swirling around crypto and decentralized network projects, which runs the full gamut from scams and stupidity, to very clever and inspired ideas, the release of yet another whitepaper does not immediately set off an attention klaxon.

But this whitepaper — which details a new protocol for achieving consensus within a decentralized network — is worth paying more attention to than most.

MaidSafe, the team behind it, are also the polar opposite of fly-by-night crypto opportunists. They’ve been working on decentralized networking since long before the space became the hot, hyped thing it is now.

Their overarching mission is to engineer an entirely decentralized Internet which bakes in privacy, security and freedom of expression by design — the ‘Safe’ in their planned ‘Safe Network’ stands for ‘Secure access for everyone’ — meaning it’s encrypted, autonomous, self-organizing, self-healing. And the new consensus protocol is just another piece towards fulfilling that grand vision.

What’s consensus in decentralized networking terms? “Within decentralized networks you must have a way of the network agreeing on a state — such as can somebody access a file or confirming a coin transaction, for example — and the reason you need this is because you don’t have a central server to confirm all this to you,” explains MaidSafe’s COO Nick Lambert, discussing what the protocol is intended to achieve.

“So you need all these decentralized nodes all reaching agreement somehow on a state within the network. Consensus occurs by each of these nodes on the network voting and letting the network as a whole know what it thinks of a transaction.

“It’s almost like consensus could be considered the heart of the networks. It’s required for almost every event in the network.”
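
To put the abstract idea in more concrete terms, a toy version of that voting step might look like the sketch below: a naive supermajority tally in Python, nothing like the real machinery of an asynchronous consensus protocol, which has to cope with faults, timing and malicious votes.

# Toy supermajority vote counter, purely to illustrate the concept of
# decentralized nodes agreeing on a state. Not MaidSafe's Parsec.
from collections import Counter

def decide(votes: dict, total_nodes: int):
    """votes maps node_id -> that node's view of an event (e.g. 'valid')."""
    if not votes:
        return None
    state, count = Counter(votes.values()).most_common(1)[0]
    # Only treat the event as decided once more than two thirds of all nodes agree.
    return state if count * 3 > total_nodes * 2 else None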

We wrote about MaidSafe’s alternative, server-less Internet in 2014. But they actually began work on the project in stealth all the way back in 2006. So they’re over a decade into the R&D at this point.

The network is p2p because it’s being designed so that data is encrypted locally, broken up into pieces and then stored, distributed and replicated across the network, relying on users’ own compute resources to take the strain. No servers necessary.
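
In outline, the storage model is something like the following sketch: chunk the data, content-address each piece, and hand the pieces to other nodes. (This is a rough illustration with assumed parameters; MaidSafe's actual self-encryption and replication scheme is more involved.)

import hashlib

CHUNK_SIZE = 1024 * 1024  # assumed 1MB chunks, purely for illustration

def chunk_and_address(data: bytes):
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    # Content-address each piece by its hash so any node can verify what it stores.
    # In a real system each chunk would also be encrypted client-side before
    # leaving the device, and replicated to multiple nodes.
    return [(hashlib.sha256(chunk).hexdigest(), chunk) for chunk in chunks]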

The prototype Safe Network is currently in an alpha testing stage (they opened for alpha in 2016). Several more alpha test stages are planned, with a beta release still a distant, undated prospect at this stage. But rearchitecting the entire Internet was clearly never going to be a day’s work.

MaidSafe also ran a multimillion dollar crowdsale in 2014 — for a proxy token of the coin that will eventually be baked into the network — and did so long before ICOs became a crypto-related bandwagon that all sorts of entities were jumping onto. The SafeCoin cryptocurrency is intended to operate as the incentive mechanism for developers to build apps for the Safe Network and for users to contribute compute resource, and thus bring MaidSafe’s distributed dream alive.

Their timing on the token sale front, coupled with prudent hodling of some of the Bitcoins they’ve raised, means they’re essentially in a position of not having to worry about raising more funds to build the network, according to Lambert.

A rough, back-of-an-envelope calculation on MaidSafe’s original crowdsale suggests that, given they raised $2M in Bitcoin in April 2014 when the price of 1BTC was around $500, the Bitcoins they obtained then could be worth between ~$30M and ~$40M at today’s Bitcoin prices — though that assumes they held on to most of them. Bitcoin’s price also peaked far higher last year.
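
For what it's worth, the envelope math behind that range is simply the following (both prices are assumptions for illustration, not MaidSafe figures):

raised_usd = 2_000_000
btc_price_apr_2014 = 500                    # rough April 2014 price
btc_held = raised_usd / btc_price_apr_2014  # roughly 4,000 BTC
for assumed_2018_price in (7_500, 10_000):  # assumed current price range
    print(round(btc_held * assumed_2018_price / 1e6), "million USD")
# prints 30 and 40, i.e. ~$30M-$40M if most of the coins were held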

As well as the token sale they also did an equity raise in 2016, via the fintech investment platform bnktothefuture, pulling in around $1.7M from that — in a mixture of cash and “some Bitcoin”.

“It’s gone both ways,” says Lambert, discussing the team’s luck with Bitcoin. “The crowdsale we were on the losing end of Bitcoin price decreasing. We did a raise from bnktothefuture in autumn of 2016… and fortunately we held on to quite a lot of the Bitcoin. So we rode the Bitcoin price up. So I feel like the universe paid us back a little bit for that. So it feels like we’re level now.”

“Fundraising is exceedingly time consuming right through the organization, and it does take a lot of time away from what you want to be focusing on, and so to be in a position where you’re not desperate for funding is a really nice one to be in,” he adds. “It allows us to focus on the technology and releasing the network.”

The team’s headcount is now up to around 33, with founding members based at the HQ in Ayr, Scotland, and other engineers working remotely or distributed (including in a new dev office they opened in India at the start of this year), even though MaidSafe is still not taking in any revenue.

This April they also made the decision to switch from a dual licensing approach for their software — previously offering both an open source license and a commercial license (which let people close source their code for a fee) — to going only open source, to encourage more developer engagement and contributions to the project, as Lambert tells it.

“We always see the SafeNetwork a bit like a public utility,” he says. “In terms of once we’ve got this thing up and launched we don’t want to control it or own it because if we do nobody will want to use it — it needs to be seen as everyone contributing. So we felt it’s a much more encouraging sign for developers who want to contribute if they see everything is fully open sourced and cannot be closed source.”

MaidSafe’s story so far is reason enough to take note of their whitepaper.

But the consensus issue the paper addresses is also a key challenge for decentralized networks so any proposed solution is potentially a big deal — if indeed it pans out as promised.


Protocol for Asynchronous, Reliable, Secure and Efficient Consensus

MaidSafe reckons they’ve come up with a way of achieving consensus on decentralized networks that’s scalable, robust and efficient. Hence the name of the protocol — ‘Parsec’ — being short for: ‘Protocol for Asynchronous, Reliable, Secure and Efficient Consensus’.

They will be open sourcing the protocol under a GPL v3 license — with a rough timeframe of “months” for that release, according to Lambert.

He says they’ve been working on Parsec for the last 18 months to two years — but also drawing on earlier research the team carried out into areas such as conflict-free replicated data types, synchronous and asynchronous consensus, and topics such as threshold signatures and common coin.

More specifically, the research underpinning Parsec is based on the following five papers:

1. Baird L., The Swirlds Hashgraph Consensus Algorithm: Fair, Fast, Byzantine Fault Tolerance, Swirlds Tech Report SWIRLDS-TR-2016-01 (2016)
2. Mostefaoui A., Hamouna M., Raynal M., Signature-Free Asynchronous Byzantine Consensus with t < n/3 and O(n²) Messages, ACM PODC (2014)
3. Micali S., Byzantine Agreement, Made Trivial (2018)
4. Miller A., Xia Y., Croman K., Shi E., Song D., The Honey Badger of BFT Protocols, CCS (2016)
5. Team Rocket, Snowflake to Avalanche: A Novel Metastable Consensus Protocol Family for Cryptocurrencies (2018)

One tweet responding to the protocol’s unveiling just over a week ago wonders whether it’s too good to be true. Time will tell — but the potential is certainly enticing.

Bitcoin’s use of a drastically energy-inefficient ‘proof of work’ method to achieve consensus and write each transaction to its blockchain very clearly doesn’t scale. It’s slow, cumbersome and wasteful. And how to get blockchain-based networks to support the billions of transactions per second that might be needed to sustain the various envisaged applications remains very much a work in progress — with projects investigating various ideas and approaches to try to overcome the limitation.

MaidSafe’s network is not blockchain-based. It’s engineered to function with asynchronous voting of nodes, rather than synchronous voting, which should avoid the bottleneck problems associated with blockchain. But it’s still decentralized. So it needs a consensus mechanism to enable operations and transactions to be carried out autonomously and robustly. That’s where Parsec is intended to slot in.

The protocol does not use proof of work. And it is able, so the whitepaper claims, to achieve consensus even if up to a third of the network is made up of malicious nodes — i.e. nodes which are attempting to disrupt network operations or otherwise attack the network.

Another claimed advantage is that decisions made via the protocol are both mathematically guaranteed and irreversible.

“What Parsec does is it can reach consensus even with malicious nodes. And up to a third of the nodes being malicious is what the maths proofs suggest,” says Lambert. “This ability to provide mathematical guarantees that all parts of the network will come to the same agreement at a point in time, even with some fault in the network or bad actors — that’s what Byzantine Fault Tolerance is.”
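
That “up to a third” figure lines up with the classic Byzantine fault tolerance bound: a network of n voting nodes can tolerate at most f malicious ones where n ≥ 3f + 1, i.e. strictly fewer than a third. As a trivial illustration (not code from the whitepaper):

def max_byzantine(n: int) -> int:
    # Classic asynchronous BFT bound: n >= 3f + 1 must hold, so the
    # largest tolerable number of faulty/malicious nodes is (n - 1) // 3.
    return (n - 1) // 3

max_byzantine(100)  # -> 33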

In theory a blockchain using proof of work could be hacked if any one entity controlled 51% of the network’s hashing power (although in reality the amount of energy that would be required makes such an attack pretty much impractical).

So on the surface MaidSafe’s decentralized network — which ‘only’ needs 33% of its nodes to be compromised for its consensus decisions to be attacked — sounds rather less robust. But Lambert says it’s more nuanced than the numbers suggest. And in fact the malicious third would also need to be nodes that have the authority to vote. “So it is a third but it’s a third of well reputed nodes,” as he puts it.

So there’s an element of proof of stake involved too, bound up with additional planned characteristics of the Safe Network — related to dynamic membership and sharding (Lambert says MaidSafe has additional whitepapers on both those elements coming soon).

“Those two papers, particularly the one around dynamic membership, will explain why having a third of malicious nodes is actually harder than just having 33% of malicious nodes. Because the nodes that can vote have to have a reputation as well. So it’s not just purely you can flood the Safe Network with lots and lots of malicious nodes and override it only using a third of the nodes. What we’re saying is the nodes that can vote and actually have a say must have a good reputation in the network,” he says.

“The other thing is proof of stake… Everyone is desperate to move away from proof of work because of its environmental impact. So proof of stake — I liken it to the Scottish landowners, where people with a lot of power have more say. In the cryptocurrency field, proof of stake might be if you have, let’s say, 10 coins and I have one coin your vote might be worth 10x as much authority as what my one coin would be. So any of these mechanisms that they come up with it has that weighting to it… So the people with the most vested interests in the network are also given the more votes.”

Sharding refers to closed groups that allow for consensus votes to be reached by a subset of nodes on a decentralized network. By splitting the network into small sections for consensus voting purposes the idea is you avoid the inefficiencies of having to poll all the nodes on the network — yet can still retain robustness, at least so long as subgroups are carefully structured and secured.

“If you do that correctly you can make it more secure and you can make things much more efficient and faster,” says Lambert. “Because rather than polling, let’s say 6,000 nodes, you might be polling eight nodes. So you can get that information back quickly.

“Obviously you need to be careful about how you do that because with much less nodes you can potentially game the network so you need to be careful how you secure those smaller closed groups or shards. So that will be quite a big thing because pretty much every crypto project is looking at sharding to make, certainly, blockchains more efficient. And so the fact that we’ll have something coming out in that, after we have the dynamic membership stuff coming out, is going to be quite exciting to see the reaction to that as well.”
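
The basic mechanic of sharding can be caricatured very simply: deterministically map each node to a small group, then poll only that group. The Python sketch below is illustrative only; MaidSafe's forthcoming design will also have to handle reputation, secure group membership and nodes joining and leaving.

import hashlib

def shard_for(node_id: str, num_shards: int) -> int:
    # Deterministically assign a node to one of a fixed number of shards.
    digest = hashlib.sha256(node_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_shards

# A consensus vote then only needs to poll the handful of nodes in the
# relevant shard, rather than every node on the network.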

Voting authority on the Safe Network might be based on a node’s longevity, quality and historical activity — so a sort of ‘reputation’ score (or ledger) that can yield voting rights over time.

“If you’re like that then you will have a vote in these closed groups. And so a third of those votes — and that then becomes quite hard to game because somebody who’s then trying to be malicious would need to have their nodes act as good corporate citizens for a time period. And then all of a sudden become malicious, by which time they’ve probably got a vested stake in the network. So it wouldn’t be possible for someone to just come and flood the network with new nodes and then be malicious because it would not impact upon the network,” Lambert suggests.
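
A crude way to picture that reputation gate (thresholds invented purely for illustration; the real criteria haven't been published):

def can_vote(age_days: int, uptime_ratio: float, requests_served: int) -> bool:
    # Toy rule: only long-lived, reliable, active nodes earn voting rights,
    # so an attacker can't simply flood the network with fresh malicious nodes.
    return age_days >= 30 and uptime_ratio >= 0.95 and requests_served >= 1_000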

The computing power that would be required to attack the Safe Network once it’s public and at scale would also be “really, really significant”, he adds. “Once it gets to scale it would be really hard to co-ordinate anything against it because you’re always having to be several hundred percent bigger than the network and then have a co-ordinated attack on it itself. And all of that work might get you to impact the decision within one closed group. So it’s not even network wide… And that decision could be on who accesses one piece of encrypted shard of data for example… Even the thing you might be able to steal is only an encrypted shard of something — it’s not even the whole thing.”

Other distributed ledger projects are similarly working on Asynchronous Byzantine Fault Tolerant (ABFT) consensus models, including those using directed acyclic graphs (DAGs) — another nascent decentralization technology that’s been suggested as an alternative to blockchain.

And indeed ABFT techniques predate Bitcoin, though MaidSafe says these kinds of models have only more recently become viable thanks to research and the relative maturing of decentralized computing and data types, itself a consequence of increased interest and investment in the space.

However in the case of Hashgraph — the DAG project which has probably attracted the most attention so far — it’s closed source, not open. So that’s one major difference with MaidSafe’s approach. 

Another difference that Lambert points to is that Parsec has been built to work in a dynamic, permissionless network environment (essential for the intended use-case, as the Safe Network is intended as a public network). Whereas he claims Hashgraph has only demonstrated its algorithms working on a permissioned (and therefore private) network “where all the nodes are known”.

He also suggests there’s a question mark over whether Hashgraph’s algorithm can achieve consensus when there are malicious nodes operating on the network. Which — if true — would limit what it can be used for.

“The Hashgraph algorithm is only proven to reach agreement if there’s no adversaries within the network,” Lambert claims. “So if everything’s running well then happy days, but if there’s any maliciousness or any failure within that network then — certainly on the basis of what’s been published — it would suggest that that algorithm was not going to hold up to that.”

“I think being able to do all of these things asynchronously with all of the mathematical guarantees is very difficult,” he continues, returning to the core consensus challenge. “So at the moment we see that we have come out with something that is unique, that covers a lot of these bases, and is a very good use for our use-case. And I think will be useful for others — so I think we like to think that we’ve made a paradigm shift or a vast improvement over the state of the art.”


Paradigm shift vs marginal innovation

Despite the team’s conviction that, with Parsec, they’ve come up with something very notable, early feedback includes some very vocal Twitter doubters.

For example there’s a lengthy back-and-forth between several MaidSafe engineers and Ethereum researcher Vlad Zamfir — who dubs the Parsec protocol “overhyped” and a “marginal innovation if that”… so, er, ouch.

Lambert is, if not entirely sanguine, then solidly phlegmatic in the face of a bit of initial Twitter blowback — saying he reckons it will take more time for more detailed responses to come, i.e. allowing for people to properly digest the whitepaper.

“In the world of async BFT algorithms, any advance is huge,” MaidSafe CEO David Irvine also tells us when we ask for a response to Zamfir’s critique. “How huge is subjective, but any advance has to be great for the world. We hope others will advance Parsec like we have built on others (as we clearly state and thank them for their work).  So even if it was a marginal development (which it certainly is not) then I would take that.”

“All in all, though, nothing was said that took away from the fact Parsec moves the industry forward,” he adds. “I felt the comments were a bit juvenile at times and a bit defensive (probably due to us not agreeing with POS in our Medium post) but in terms of the only part commented on (the coin flip) we as a team feel that part could be much more concrete in terms of defining exactly how small such random (finite) delays could be. We know they do not stop the network and a delaying node would be killed, but for completeness, it would be nice to be that detailed.”

A developer source of our own in the crypto/blockchain space — who’s not connected to the MaidSafe or Ethereum projects — also points out that Parsec “getting objective review will take some time given that so many potential reviewers have vested interest in their own project/coin”.

It’s certainly fair to say the space excels at public spats and disagreements. Researchers pouring effort into one project can be less than kind to rivals’ efforts. (And, well, given all the crypto Lambos at stake it’s not hard to see why there can be no love lost — and, ironically, zero trust — between competing champions of trustless tech.)

Another fundamental truth of these projects is they’re all busily experimenting right now, with lots of ideas in play to try and fix core issues like scalability, efficiency and robustness — often having different ideas over implementation even if rival projects are circling and/or converging on similar approaches and techniques.

“Certainly other projects are looking at sharding,” says Lambert. “So I know that Ethereum are looking at sharding. And I think Bitcoin are looking at that as well, but I think everyone probably has quite different ideas about how to implement it. And of course we’re not using a blockchain which makes that another different use-case where Ethereum and Bitcoin obviously are. But everyone has — as with anything — these different approaches and different ideas.”

“Every network will have its own different ways of doing [consensus],” he adds when asked whether he believes Parsec could be adopted by other projects wrestling with the consensus challenge. “So it’s not like some could lift [Parsec] out and just put it in. Ethereum is blockchain-based — I think they’re looking at something around proof of stake, but maybe they could take some ideas or concepts from the work that we’re open sourcing for their specific case.

“If you get other blockchain-less networks like IOTA, Byteball, I think POA is another one as well. These other projects it might be easier for them to implement something like Parsec with them because they’re not using blockchain. So maybe less of that adaption required.”

Whether other projects will deem Parsec worthy of their attention remains to be seen at this point with so much still to play for. Some may prefer to expend effort trying to rubbish a rival approach, whose open source tech could, if it stands up to scrutiny and operational performance, reduce the commercial value of proprietary and patented mechanisms also intended to grease the wheels of decentralized networks — for a fee.

And of course MaidSafe’s developed-in-stealth consensus protocol may also turn out to be a relatively minor development. But finding a non-vested expert to give an impartial assessment of complex network routing algorithms conjoined to such a self-interested and, frankly, anarchical industry is another characteristic challenge of the space.

Irvine’s view is that DAG based projects which are using a centralized component will have to move on or adopt what he dubs “state of art” asynchronous consensus algorithms — as MaidSafe believes Parsec is — aka, algorithms which are “more widely accepted and proven”.

“So these projects should contribute to the research, but more importantly, they will have to adopt better algorithms than they use,” he suggests. “So they can play an important part, upgrades! How to upgrade a running DAG based network? How to hard fork a graph? etc. We know how to hard fork blockchains, but upgrading DAG based networks may not be so simple when they are used as ledgers.

“Projects like Hashgraph, Algorand etc will probably use an ABFT algorithm like this as their whole network with a little work for a currency; IOTA, NANO, Byteball etc should. That is entirely possible with advances like Parsec. However adding dynamic membership, sharding, a data layer then a currency is a much larger proposition, which is why Parsec has been in stealth mode while it is being developed.

“We hope that by being open about the algorithm, and making the code open source when complete, we will help all the other projects working on similar problems.”

Of course MaidSafe’s team might be misguided in terms of the breakthrough they think they’ve made with Parsec. But it’s pretty hard to stand up the idea they’re being intentionally misleading.

Because, well, what would be the point of that? While the exact depth of MaidSafe’s funding reserves isn’t clear, Lambert doesn’t sound like a startup guy with money worries. And the team’s staying power cannot be in doubt — over a decade into the R&D needed to underpin their alt network.

It’s true that being around for so long does have some downsides, though. Especially, perhaps, given how hyped the decentralized space has now become. “Because we’ve been working on it for so long, and it’s been such a big project, you can see some negative feedback about that,” as Lambert admits.

And with such intense attention now on the space, injecting energy which in turn accelerates ideas and activity, there’s perhaps extra pressure on a veteran player like MaidSafe to be seen making a meaningful contribution — ergo, it might be tempting for the team to believe the consensus protocol they’ve engineered really is a big deal.

To stand up and be counted amid all the noise, as it were. And to draw attention to their own project — which needs lots of external developers to buy into the vision if it’s to succeed, yet, here in 2018, it’s just one decentralization project among so many. 


The Safe Network roadmap

Consensus aside, MaidSafe’s biggest challenge is still turning the sizable amount of funding and resources the team’s ideas have attracted to date into a bona fide alternative network that anyone really can use. And there’s a very long road to travel still on that front, clearly.

The Safe Network is currently in its alpha 2 testing incarnation (which has been up and running since September last year) — consisting of around a hundred nodes that MaidSafe maintains itself.

The core decentralization proposition of anyone being able to supply storage resource to the network by lending their own spare capacity is not yet live — and won’t come fully until alpha 4.

“People are starting to create different apps against that network. So we’ve seen Jams — a decentralized music player… There are a couple of storage style apps… There is encrypted email running as well, and also that is running on Android,” says Lambert. “And we have a forked version of the Beaker browser — that’s the browser that we use right now. So if you can create websites on the Safe Network, which has its own protocol, and if you want to go and view those sites you need a Safe browser to do that, so we’ve also been working on our own browser from scratch that we’ll be releasing later this year… So there’s a number of apps that are running against that alpha 2 network.

“What alpha 3 will bring is it will run in parallel with alpha 2 but it will effectively be a decentralized routing network. What that means is it will be one for more technical people to run, and it will enable data to be passed around a network where anyone can contribute their resources to it but it will not facilitate data storage. So it’ll be a command line app, which is probably why it’ll suit technical people more because there’ll be no user interface for it, and they will contribute their resources to enable messages to be passed around the network. So secure messaging would be a use-case for that.

“And then alpha 4 is effectively bringing together alpha 2 and alpha 3. So it adds a storage layer on top of the alpha 3 network — and at that point it gives you the fully decentralized network where users are contributing their resources from home and they will be able to store data, send messages and things of that nature. Potentially during alpha 4, or a later alpha, we’ll introduce test SafeCoin. Which is the final piece of the initial puzzle to provide incentives for users to provide resources and for developers to make apps. So that’s probably what the immediate roadmap looks like.”

On the timeline front Lambert won’t be coaxed into attaching any deadlines to all these planned alphas. They long ago learnt not to try to predict the pace of progress, he says with a laugh. Though he insists progress is being made.

“These big infrastructure projects are typically only government funded because the payback is too slow for venture capitalists,” he adds. “So in the past you had things like Arpanet, the precursor to the Internet — that was obviously a US government funded project — and so we’ve taken on a project which has, not grown arms and legs, but certainly there’s more to it than what was initially thought about.

“So we are almost privately funding this infrastructure. Which is quite a big scope, and I will say why it’s taking a bit of time. But we definitely do seem to be making lots of progress.”

Brexit blow for UK’s hopes of helping set AI rules in Europe

The UK’s hopes of retaining an influential role for its data protection agency in shaping European Union regulations post-Brexit — including helping to set any new Europe-wide rules around artificial intelligence — look well and truly dashed.

In a speech at the weekend in front of the International Federation for European Law, the EU’s chief Brexit negotiator, Michel Barnier, shot down the notion of anything other than a so-called ‘adequacy decision’ being on the table for the UK after it exits the bloc.

If granted, an adequacy decision is an EU mechanism for enabling citizens’ personal data to more easily flow from the bloc to third countries — as the UK will be after Brexit.

Such decisions are only granted by the European Commission after a review of a third country’s privacy standards that’s intended to determine whether they offer essentially equivalent protections to EU rules.

But the mechanism does not allow for the third country to be involved, in any shape or form, in discussions around forming and shaping the EU’s rules themselves. So, in the UK’s case, the country would be going from having a seat at the rule-making table to being shut out of the process entirely — at a time when the EU is really setting the global agenda on digital regulations.

“The United Kingdom decided to leave our harmonised system of decision-making and enforcement. It must respect the fact that the European Union will continue to work on the basis of this system, which has allowed us to build a single market, and which allows us to deepen our single market in response to new challenges,” said Barnier in Lisbon on Saturday.

“And, as indicated in the European Council guidelines, the UK must understand that the only possibility for the EU to protect personal data is through an adequacy decision. It is one thing to be inside the Union, and another to be outside.”

“Brexit is not, and never will be, in the interest of EU businesses,” he added. “And it will especially run counter to the interests of our businesses if we abandon our decision-making autonomy. This autonomy allows us to set standards for the whole of the EU, but also to see these standards being replicated around the world. This is the normative power of the Union, or what is often called ‘the Brussels effect’.

“And we cannot, and will not, share this decision-making autonomy with a third country, including a former Member State who does not want to be part of the same legal ecosystem as us.”

Earlier this month the UK’s Information Commissioner, Elizabeth Denham, told MPs on the UK parliament’s committee for exiting the European Union that a bespoke data agreement that gave the ICO a continued role after Brexit would be a far superior option to an adequacy agreement — pointing out that the UK stands to lose influence at a time when the EU is setting global privacy standards via the General Data Protection Regulation (GDPR), which came into full force last Friday.

“At this time when the GDPR is in its infancy, participating in shaping and interpreting the law I think is really important. And the group of regulators that sit around the table at the EU are the most influential blocs of regulators — and if we’re outside of that group and we’re an observer we’re not going to have the kind of effect that we need to have with big tech companies. Because that’s all going to be decided by that group of regulators,” she warned.

“The European Data Protection Board will set the weather when it comes to standards for artificial intelligence, for technologies, for regulating big tech. So we will be a less influential regulator, we will continue to regulate the law and protect UK citizens as we do now, but we won’t be at the leading edge of interpreting the GDPR — and we won’t be bringing British values to that table if we’re not at the table.”

She also pointed out that without a bespoke arrangement to accommodate the ICO her office would also be shut out of participating in the GDPR’s one-stop shop, which allows EU data protection agencies to work together and co-ordinate regulatory actions, and which she said “would bring huge advantages to both sides and also to British businesses”.

Huge advantages that the UK stands to lose as a result of Brexit.

With the ICO being excluded from participating in GDPR’s one-stop shop mechanism, it also means UK businesses will have to choose an alternative data protection agency within the EU to act as their lead regulator after Brexit — putting yet another burden on startups as they will need to build new relationships with a regulator in the EU.

The Irish Data Protection Commission seems the likely candidate for UK companies to look to after Brexit, when the ICO is on the sidelines of GDPR, given shared language and proximity. (And Ireland’s DPC has been ramping up its headcount in anticipation of handling more investigations as a result of the new regulation.)

But UK businesses would clearly prefer to be able to continue working with their domestic regulator. Unfortunately, though, Brexit closes the door on that option.

We’ve reached out to the ICO for comment and will update this story with any response.

The UK government has committed to aligning the country with GDPR regardless of Brexit — as it seeks to avoid the economic threat of EU-UK data flows being cut off if it’s not judged to be providing adequate data protection.

Looking ahead, that also essentially means the UK will need to keep its regulatory regime aligned with the EU’s in perpetuity — or risk being deemed inadequate, with, once again, the prospect of data flows being cut off (or at the very least businesses scrambling to put alternative legal arrangements in place to authorize their data flows, and being saddled with the expense of doing so, as happened when Safe Harbor was struck down in 2015).

So, thanks to Brexit, it will be the rest of Europe setting the agenda on regulating AI — with the UK bound to follow.