Fintech platform Synapse raises $33M to build ‘the AWS of banking’

Synapse, a San Francisco-based startup that operates a platform enabling banks and fintech companies to easily develop financial services, has closed a $33 million Series B to develop new products and go after international expansion.

The investment was led by Andreessen Horowitz with participation from existing backers Trinity Ventures and Core Innovation Capital. Synapse — which recently rebranded (slightly) from ‘SynapseFi’ — announced a $17 million Series A back in September 2018, so this deal takes its total raised to $50 million to date.

The startup was founded in 2014 by Bryan Keltner and India-born CEO Sankaet Pathak, who came to the U.S. to study but grew frustrated at the difficulty of opening a bank account without U.S. social security history. Inspired by those struggles, Pathak built Synapse, which operated under the radar prior to that Series A deal, around the goal of democratizing financial services.

Its approach to doing that is a platform-based one that makes it easy for banks and other financial companies to work with developers. The current system for working with financial institutions is frankly a mess; it involves a myriad of different standards, interfaces, code bases and other compatibility issues that cause confusion and consume time. Through developer- and bank-facing APIs, Synapse aims to make it easier for companies to connect with banks, and, in turn, for banks to automate and extend their back-end operations.

Pathak previously told us the philosophy is a “Lego brick” approach to building services. Its modules and services include payment, deposit, lending, ID verification/KYC, card issuance and investment services.
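Pathak’s “Lego brick” framing is easiest to picture as a handful of small, composable API calls. The sketch below is purely illustrative (the base URL, endpoint paths and payload fields are hypothetical, not Synapse’s actual API), but it shows how a developer might chain identity verification, deposit-account creation and card issuance into a single onboarding flow.

```python
import requests

# Hypothetical "Lego brick" banking-as-a-service API. The base URL, paths and
# payloads below are invented for illustration; they are not Synapse's real API.
BASE = "https://api.example-baas.com/v1"
HEADERS = {"Authorization": "Bearer <client_token>"}

def verify_user(name: str, ssn: str) -> str:
    # KYC module: submit identity details for verification.
    resp = requests.post(f"{BASE}/users", headers=HEADERS, json={
        "name": name,
        "documents": [{"type": "SSN", "value": ssn}],
    })
    resp.raise_for_status()
    return resp.json()["user_id"]

def open_deposit_account(user_id: str) -> str:
    # Deposit module: open an account for a verified user.
    resp = requests.post(f"{BASE}/users/{user_id}/accounts",
                         headers=HEADERS, json={"type": "deposit"})
    resp.raise_for_status()
    return resp.json()["account_id"]

def issue_card(user_id: str, account_id: str) -> str:
    # Card-issuance module: attach a debit card to the new account.
    resp = requests.post(f"{BASE}/users/{user_id}/cards",
                         headers=HEADERS, json={"account_id": account_id})
    resp.raise_for_status()
    return resp.json()["card_id"]

user_id = verify_user("Jane Doe", "000-00-0000")
account_id = open_deposit_account(user_id)
card_id = issue_card(user_id, account_id)
```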

“We want to make it super easy for developers to build and scale financial products and we want to do that across the spectrum of financial products,” he told TechCrunch in an interview this week.

Synapse CEO Sankaet Pathak

“We don’t think Bank of America, Chase and Wells Fargo will be front and center” of new fintech, he added. “We want to make it really easy for internet companies to distribute financial services.”

The product development strategy is to add “pretty much anything that we think would be an accelerant to democratizing financial services for everyone,” he explained. “We want to make these tools and features available for developers.”

Interestingly, the company has a public product roadmap — the newest version is here.

The concept of an ‘operating system for banking’ is one that resonates with the kind of investment thesis associated with A16z, and Pathak said the firm was “number one” on his list of target VCs.

With more than half of that Series A round still in the bank, Pathak explained that the Series B is less about money and more about finding “a partner who can help us on the next phase, which is very focused on expansion.”

As part of the deal, Angela Strange, A16z’s fintech and enterprise-focused general partner, has joined the startup’s board. Strange, whose portfolio includes Branch, described Synapse as “the AWS of banking” for its potential to let anyone build a fintech company, paralleling the way Amazon’s cloud services let anyone, anywhere develop and deploy a web service.

Having already found product-market fit in the U.S. — where its tech reaches nearly three million end users and handles five million API requests daily — Synapse is looking overseas. It plans to launch first in Canada and Europe before the end of the year, with initial services including payments and deposits/debit card issuance. Lending and investment products are slated to follow next year.

Members of the Synapse team

Further down the line, Pathak said he is eager to break into Asia and, potentially, markets in Latin America and Africa, although expansions aren’t likely until 2020 at the earliest. Once things pick up, though, the startup is aiming to enter two “key” markets per year alongside one “underserved” one.

“We’ve been preparing for [global expansion] for a while,” he said, pointing out that the startup has built key tech in-house, including computer vision capabilities.

“Our goal is to be in every country that’s not at war or under sanction from the U.S.,” Pathak added.

At home, the company is looking to add a raft of new services for customers. That includes improvements and new features for card issuance, brokerage accounts, new areas for its loans product, more detailed KYC and identification and a chatbot platform.

Outside of product, the company is pushing to make its platform self-service to remove friction for developers who want to use Synapse services, and there are plans to launch a seed investment program that’ll help Synapse developer partners connect with investors. Interestingly, the latter program could see Synapse join investment rounds by offering credit for its services.

More generally on financial matters, the Synapse CEO said the company reached $12 million ARR last year. This year, he is aiming to double that number through growth that, he maintains, is sustainable.

“If we stop hiring, we could break even and be profitable in three to four months,” said Pathak. “I like to keep the burn like that… it stabilizes us as a company.”

London’s Tube network to switch on wi-fi tracking by default in July

Transport for London will roll out default wi-fi device tracking on the London Underground this summer, following a trial back in 2016.

In a press release announcing the move, TfL writes that “secure, privacy-protected data collection will begin on July 8” — while touting additional services, such as improved alerts about delays and congestion, which it frames as “customer benefits”, as expected to launch “later in the year”.

As well as offering additional alerts-based services to passengers via its own website/apps, TfL says it could incorporate crowding data into its free open-data API — to allow app developers, academics and businesses to expand the utility of the data by baking it into their own products and services.

It’s not all just added utility though; TfL says it will also use the information to enhance its in-station marketing analytics — and, it hopes, top up its revenues — by tracking footfall around ad units and billboards.

Commuters using the UK capital’s publicly funded transport network who do not want their movements being tracked will have to switch off their wi-fi, or else put their phone in airplane mode when using the network.

To deliver data of the required detail, TfL says detailed digital mapping of all London Underground stations was undertaken to identify where wi-fi routers are located so it can understand how commuters move across the network and through stations.

It says it will erect signs at stations informing passengers that using the wi-fi will result in connection data being collected “to better understand journey patterns and improve our services” — and explaining that to opt out they have to switch off their device’s wi-fi.

Attempts in recent years by smartphone OSes to use MAC address randomization to try to defeat persistent device tracking have been shown to be vulnerable to reverse engineering via flaws in wi-fi set-up protocols. So, er, switch off to be sure.

We covered TfL’s wi-fi tracking beta back in 2017, when we reported that despite claiming the harvested wi-fi data was “de-personalised” and that individuals using the Tube network could not be identified, TfL nonetheless declined to release the “anonymized” data-set after a Freedom of Information request — saying there remained a risk of individuals being re-identified.

As has been shown many times before, reversing ‘anonymization’ of personal data can be frighteningly easy.

It’s not immediately clear from the press release or TfL’s website exactly how it will be encrypting the location data gathered from devices that authenticate to use the free wi-fi at the circa 260 wi-fi enabled London Underground stations.

Its explainer about the data collection does not go into any real detail about the encryption and security being used. (We’ve asked for more technical details.)

“If the device has been signed up for free Wi-Fi on the London Underground network, the device will disclose its genuine MAC address. This is known as an authenticated device,” TfL writes generally of how the tracking will work.

“We process authenticated device MAC address connections (along with the date and time the device authenticated with the Wi-Fi network and the location of each router the device connected to). This helps us to better understand how customers move through and between stations — we look at how long it took for a device to travel between stations, the routes the device took and waiting times at busy periods.”

“We do not collect any other data generated by your device. This includes web browsing data and data from website cookies,” it adds, saying also that “individual customer data will never be shared and customers will not be personally identified from the data collected by TfL”.
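Stripped of the PR language, the processing TfL describes amounts to grouping timestamped router connections by pseudonymised device and reading journey times off the gaps between stations. A rough illustration, assuming a simple list of (device pseudonym, timestamp, station) records rather than TfL’s actual schema:

```python
from collections import defaultdict
from datetime import datetime

# Illustrative records only: (pseudonymised device id, connection time, station),
# loosely matching TfL's description of what it processes.
records = [
    ("dev_a1", datetime(2019, 7, 8, 8, 1), "Oxford Circus"),
    ("dev_a1", datetime(2019, 7, 8, 8, 14), "Liverpool Street"),
    ("dev_b2", datetime(2019, 7, 8, 8, 3), "Victoria"),
    ("dev_b2", datetime(2019, 7, 8, 8, 22), "King's Cross"),
]

# Group connections by device, then read journeys off consecutive sightings.
by_device = defaultdict(list)
for device, seen_at, station in records:
    by_device[device].append((seen_at, station))

for device, sightings in by_device.items():
    sightings.sort()
    for (t0, s0), (t1, s1) in zip(sightings, sightings[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        print(f"{device}: {s0} -> {s1} in {minutes:.0f} min")
```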

In a section entitled “keeping information secure” TfL further writes: “Each MAC address is automatically depersonalised (pseudonymised) and encrypted to prevent the identification of the original MAC address and associated device. The data is stored in a restricted area of a secure location and it will not be linked to any other data at a device level.  At no time does TfL store a device’s original MAC address.”

Privacy and security concerns were raised about the location tracking around the time of the 2016 trial — such as why TfL had used a monthly salt key to encrypt the data rather than daily salts, which would have decreased the risk of data being re-identifiable should it leak out.
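The salt-rotation concern is easier to see with a toy example. In the sketch below (a deliberate simplification, not TfL’s actual scheme), a MAC address is hashed together with a secret salt: for as long as one salt is in use, a given device always maps to the same pseudonym, so a monthly salt leaves a device’s journeys linkable across a whole month, while a daily salt limits linkage to a single day.

```python
import hashlib

def pseudonymise(mac: str, salt: str) -> str:
    # One-way hash of the MAC address plus a secret salt. Anyone who obtains
    # the salt can brute-force the ~2^48 MAC space, so the salt's secrecy and
    # how often it is rotated both matter.
    return hashlib.sha256((salt + mac.lower()).encode()).hexdigest()[:16]

mac = "a4:5e:60:c2:19:7f"

monthly = pseudonymise(mac, "salt-2019-07")     # identical all month long
day_one = pseudonymise(mac, "salt-2019-07-08")  # changes every day
day_two = pseudonymise(mac, "salt-2019-07-09")

print(monthly, day_one, day_two)
print(day_one == day_two)  # False: daily rotation breaks cross-day linkage
```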

Such concerns persist — and security experts are now calling for full technical details to be released, given TfL is going full steam ahead with a rollout.


A report in Wired suggests TfL has switched from hashing to a system of tokenisation, “fully replacing the MAC address with an identifier that cannot be tied back to any personal information”, which TfL billed as a “more sophisticated mechanism” than it had used before. We’ll update as and when we get more from TfL.

Another question over the deployment at the time of the trial was what legal basis it would use for pervasively collecting people’s location data — since the system requires an active opt-out by commuters, a consent-based legal basis would not be appropriate.

In a section on the legal basis for processing the Wi-Fi connection data, TfL writes now that its ‘legal ground’ is two-fold:

  • Our statutory and public functions
  • to undertake activities to promote and encourage safe, integrated, efficient and economic transport facilities and services, and to deliver the Mayor’s Transport Strategy

So, presumably, you can file ‘increasing revenue around adverts in stations by being able to track nearby footfall’ under ‘helping to deliver (read: fund) the mayor’s transport strategy’.

(Or as TfL puts it: “[T]he data will also allow TfL to better understand customer flows throughout stations, highlighting the effectiveness and accountability of its advertising estate based on actual customer volumes. Being able to reliably demonstrate this should improve commercial revenue, which can then be reinvested back into the transport network.”)

On data retention it specifies that it will hold “depersonalised Wi-Fi connection data” for two years — after which it will aggregate the data and retain those non-individual insights (presumably indefinitely, or per its standard data retention policies).

“The exact parameters of the aggregation are still to be confirmed, but will result in the individual Wi-Fi connection data being removed. Instead, we will retain counts of activities grouped into specific time periods and locations,” it writes on that.

It further notes that aggregated data “developed by combining depersonalised data from many devices” may also be shared with other TfL departments and external bodies. So that processed data could certainly travel.

Of the “individual depersonalised device Wi-Fi connection data”, TfL claims it is accessible only to “a controlled group of TfL employees” — without specifying how large this group of staff is; and what sort of controls and processes will be in place to prevent the risk of A) data being hacked and/or leaking out or B) data being re-identified by a staff member.

A TfL employee with intimate knowledge of a partner’s daily travel routine might, for example, have access to enough information via the system to be able to reverse the depersonalization.

Without more technical details we just don’t know. Though TfL says it worked with the UK’s data protection watchdog in designing the data collection with privacy front of mind.

“We take the privacy of our customers very seriously. A range of policies, processes and technical measures are in place to control and safeguard access to, and use of, Wi-Fi connection data. Anyone with access to this data must complete TfL’s privacy and data protection training every year,” it also notes elsewhere.

Despite holding individual level location data for two years, TfL is also claiming that it will not respond to requests from individuals to delete or rectify any personal location data it holds, i.e. if people seek to exercise their information rights under EU law.

“We use a one-way pseudonymisation process to depersonalise the data immediately after it is collected. This means we will not be able to single out a specific person’s device, or identify you and the data generated by your device,” it claims.

“This means that we are unable to respond to any requests to access the Wi-Fi data generated by your device, or for data to be deleted, rectified or restricted from further processing.”

Again, the distinctions it is making there are raising some eyebrows.

What’s amply clear is that the volume of data that will be generated as a result of a full rollout of wi-fi tracking across the lion’s share of the London Underground will be staggeringly massive.

More than 509 million “depersonalised” pieces of data were collected from 5.6 million mobile devices during the four-week 2016 trial alone — comprising some 42 million journeys. And that was a very brief trial which covered a much smaller sub-set of the network.

As big data giants go, TfL is clearly gunning to be right up there.

When it comes to elections, Facebook moves slow, may still break things

This week, Facebook invited a small group of journalists — which didn’t include TechCrunch — to look at the “war room” it has set up in Dublin, Ireland, to help monitor its products for election-related content that violates its policies. (“Time and space constraints” limited the numbers, a spokesperson told us when we asked why we weren’t invited.)

Facebook announced it would be setting up this Dublin hub — which will bring together data scientists, researchers, legal and community team members, and others in the organization to tackle issues like fake news, hate speech and voter suppression — back in January. The company has said it has nearly 40 teams working on elections across its family of apps, without breaking out the number of staff it has dedicated to countering political disinformation. 

We have been told that there would be “no news items” during the closed tour — which, despite that, is “under embargo” until Sunday — beyond what Facebook and its executives discussed last Friday in a press conference about its European election preparations.

The tour looks to be a direct copy-paste of the one Facebook held to show off its US election “war room” last year, which it did invite us on. (In that case it was forced to claim it had not disbanded the room soon after heavily PR’ing its existence — saying the monitoring hub would be used again for future elections.)

We understand — via a non-Facebook source — that several broadcast journalists were among those invited to its Dublin “war room”. So expect to see a few gauzy inside views at the end of the weekend, as Facebook’s PR machine spins up a gear ahead of the vote to elect the next European Parliament later this month.

It’s clearly hoping shots of serious-looking Facebook employees crowded around banks of monitors will play well on camera and help influence public opinion that it’s delivering an even social media playing field for the EU parliament election. The European Commission is also keeping a close watch on how platforms handle political disinformation before a key vote.

But with the pan-EU elections set to start May 23, and a general election already held in Spain last month, we believe the lack of new developments to secure EU elections is very much to the company’s discredit.

The EU parliament elections are now a mere three weeks away, and there are a lot of unresolved questions and issues Facebook has yet to address. Yet we’re told the attending journalists were once again not allowed to put any questions to the fresh-faced Facebook employees staffing the “war room”.

Ahead of the looming batch of Sunday evening ‘war room tour’ news reports, which Facebook will be hoping contain its “five pillars of countering disinformation” talking points, we’ve compiled a rundown of some key concerns and complications flowing from the company’s still highly centralized oversight of political campaigning on its platform — even as it seeks to gloss over how much dubious stuff keeps falling through the cracks.

Worthwhile counterpoints to another highly managed Facebook “election security” PR tour.

No overview of political ads in most EU markets

Since political disinformation created an existential nightmare for Facebook’s ad business with the revelations of Kremlin-backed propaganda targeting the 2016 US presidential election, the company has vowed to deliver transparency — via the launch of a searchable political ad archive for ads running across its products.

The Facebook Ad Library now shines a narrow beam of light into the murky world of political advertising. Before this, each Facebook user could only see the propaganda targeted specifically at them. Now, such ads stick around in its searchable repository for seven years. This is a major step up on total obscurity. (Obscurity that Facebook isn’t wholly keen to lift the lid on, we should add; its political data releases to researchers so far haven’t gone back before 2017.)

However, in its current form, in the vast majority of markets, the Ad Library makes the user do all the leg work — running searches manually to try to understand and quantify how Facebook’s platform is being used to spread political messages intended to influence voters.

Facebook does also offer an Ad Library Report — a downloadable weekly summary of ads viewed and highest spending advertisers. But it only offers this in four countries globally right now: the US, India, Israel and the UK.

It has said it intends to ship an update to the reports in mid-May. But it’s not clear whether that will make them available in every EU country. (Mid-May would also be pretty late for elections that start May 23.)

So while the UK report makes clear that the new ‘Brexit Party’ is now a leading spender ahead of the EU election, what about the other 27 members of the bloc? Don’t they deserve an overview too?

A spokesperson we talked to about this week’s closed briefing said Facebook had no updates on expanding Ad Library Reports to more countries, in Europe or otherwise.

So, as it stands, the vast majority of EU citizens are missing out on meaningful reports that could help them understand which political advertisers are trying to reach them and how much they’re spending.

Which brings us to…

Facebook’s Ad Archive API is far too limited

In another positive step Facebook has launched an API for the ad archive that developers and researchers can use to query the data. However, as we reported earlier this week, many respected researchers have voiced disappointment with what it’s offering so far — saying the rate-limited API is not nearly open or accessible enough to get a complete picture of all ads running on its platform.

Following this criticism, Facebook’s director of product, Rob Leathern, tweeted a response, saying the API would improve. “With a new undertaking, we’re committed to feedback & want to improve in a privacy-safe way,” he wrote.

The question is when will researchers have a fit-for-purpose tool to understand how political propaganda is flowing over Facebook’s platform? Apparently not in time for the EU elections, either: We asked about this on Thursday and were pointed to Leathern’s tweets as the only update.

This issue is compounded by Facebook also restricting the ability of political transparency campaigners — such as the UK group WhoTargetsMe and US investigative journalism site ProPublica — to monitor ads via browser plug-ins, as the Guardian reported in January.

The net effect is that Facebook is making it hard for civil society groups and public interest researchers to study the flow of political messaging on its platform and quantify its democratic impacts, offering only a highly managed level of access to ad data that falls far short of the “political ads transparency” Facebook’s PR has been loudly trumpeting since 2017.

Ad loopholes remain ripe for exploiting

Facebook’s Ad Library includes data on political ads that were active on its platform but subsequently got pulled (made “inactive” in its parlance) because they broke its disclosure rules.

There are multiple examples of inactive ads for the Spanish far right party Vox visible in Facebook’s Ad Library that were pulled for running without the required disclaimer label, for example.

“After the ad started running, we determined that the ad was related to politics and issues of national importance and required the label. The ad was taken down,” runs the standard explainer Facebook offers if you click on the little ‘i’ next to an observation that “this ad ran without a disclaimer”.

What is not at all clear is how quickly Facebook acted to remove rule-breaking political ads.

It is possible to click on each individual ad to get some additional details. Here Facebook provides a per ad breakdown of impressions; genders, ages, and regional locations of the people who saw the ad; and how much was spent on it.

But all those clicks don’t scale. So it’s not possible to get an overview of how effectively Facebook is handling political ad rule breakers. Unless, well, you literally go in clicking and counting on each and every ad…

There is then also the wider question of whether a political advertiser that is found to be systematically breaking Facebook rules should be allowed to keep running ads on its platform.

Because if Facebook does allow that to happen there’s a pretty obvious (and massive) workaround for its disclosure rules: Bad faith political advertisers could simply keep submitting fresh ads after the last batch got taken down.

We were, for instance, able to find inactive Vox ads taken down for lacking a disclaimer that had still been able to rack up thousands — and even tens of thousands — of impressions in the time they were still active.

Facebook needs to be much clearer about how it handles systematic rule breakers.

Definition of political issue ads is still opaque

Facebook currently requires that all political advertisers in the EU go through its authorization process in the country where ads are being delivered if they relate to the European Parliamentary elections, as a step to try to prevent foreign interference.

This means it asks political advertisers to submit documents and runs technical checks to confirm their identity and location. Though it noted, on last week’s call, that it cannot guarantee this ID system cannot be circumvented. (As it was last year when UK journalists were able to successfully place ads paid for by ‘Cambridge Analytica’.)

One other big potential workaround is the question of what is a political ad? And what is an issue ad?

Facebook says these types of ads on Facebook and Instagram in the EU “must now be clearly labeled, including a paid-for-by disclosure from the advertiser at the top of the ad” — so users can see who is paying for the ads and, if there’s a business or organization behind it, their contact details, plus some disclosure about who, if anyone, saw the ads.

But the big question is how is Facebook defining political and issue ads across Europe?

While political ads might seem fairly easy to categorize, assuming they’re attached to registered political parties and candidates, issue ads are a whole lot more subjective.

Currently Facebook defines issue ads as those relating to “any national legislative issue of public importance in any place where the ad is being run.” It says it worked with EU barometer, YouGov and other third parties to develop an initial list of key issues — examples for Europe include immigration, civil and social rights, political values, security and foreign policy, the economy and environmental politics — that it will “refine… over time.”

Again specifics on when and how that will be refined are not clear. Yet ads that Facebook does not deem political/issue ads will slip right under its radar. They won’t be included in the Ad Library; they won’t be searchable; but they will be able to influence Facebook users under the perfect cover of its commercial ad platform — as before.

So if any maliciously minded propaganda slips through Facebook’s net, because the company decides it’s a non-political issue, it will once again leave no auditable trace.

In recent years the company has also had a habit of announcing major takedowns of what it badges “fake accounts” ahead of major votes. But again voters have to take it on trust that Facebook is getting those judgement calls right.

Facebook continues to bar pan-EU campaigns

On the flip side of weeding out non-transparent political propaganda and/or political disinformation, Facebook is currently blocking the free flow of legal pan-EU political campaigning on its platform.

This issue first came to light several weeks ago, when it emerged that European officials had written to Nick Clegg (Facebook’s vice president of global affairs) to point out that its current rules — i.e. that require those campaigning via Facebook ads to have a registered office in the country where the ad is running — run counter to the pan-European nature of this particular election.

It means EU institutions are in the strange position of not being able to run Facebook ads for their own pan-EU election everywhere across the region. “This runs counter to the nature of EU institutions. By definition, our constituency is multinational and our target audience are in all EU countries and beyond,” the EU’s most senior civil servants pointed out in a letter to the company last month.

This issue impacts not just EU institutions and organizations advocating for particular policies and candidates across EU borders, but even NGOs wanting to run vanilla “get out the vote” campaigns Europe-wide — leading a number of them to accuse Facebook of breaching their electoral rights and freedoms.

Facebook claimed last week that the ball is effectively in the regulators’ court on this issue — saying it’s open to making the changes but has to get their agreement to do so. A spokesperson confirmed to us that there is no update to that situation, either.

Of course the company may be trying to err on the side of caution, to prevent bad actors being able to interfere with the vote across Europe. But at what cost to democratic freedoms?

What about fake news spreading on WhatsApp?

Facebook’s ‘election security’ initiatives have focused on political and/or politically charged ads running across its products. But there’s no shortage of political disinformation flowing unchecked across its platforms as user uploaded ‘content’.

On the Facebook-owned messaging app WhatsApp, which is hugely popular in some European markets, the presence of end-to-end encryption further complicates this issue by providing a cloak for the spread of political propaganda that’s not being regulated by Facebook.

In a recent study of political messages spread via WhatsApp ahead of last month’s general election in Spain, the campaign group Avaaz dubbed it “social media’s dark web” — claiming the app had been “flooded with lies and hate”.

“Posts range from fake news about Prime Minister Pedro Sánchez signing a secret deal for Catalan independence to conspiracy theories about migrants receiving big cash payouts, propaganda against gay people and an endless flood of hateful, sexist, racist memes and outright lies,” it wrote.

Avaaz compiled this snapshot of politically charged messages and memes being shared on Spanish WhatsApp by co-opting 5,833 local members to forward election-related content that they deemed false, misleading or hateful.

It says it received a total of 2,461 submissions — which is of course just a tiny, tiny fraction of the stuff being shared in WhatsApp groups and chats. Which makes this app the elephant in Facebook’s election ‘war room’.

What exactly is a war room anyway?

Facebook has said its Dublin Elections Operation Center — to give it its official title — is “focused on the EU elections”, while also suggesting it will plug into a network of global teams “to better coordinate in real time across regions and with our headquarters in California [and] accelerate our rapid response times to fight bad actors and bad content”.

But we’re concerned Facebook is sending out mixed — and potentially misleading — messages about how its election-focused resources are being allocated.

Our (non-Facebook) source told us the 40-odd staffers in the Dublin hub during the press tour were simultaneously looking at the Indian elections. If that’s the case, it does not sound entirely “focused” on either the EU or India’s elections. 

Facebook’s eponymous platform has 2.375 billion monthly active users globally, with some 384 million MAUs in Europe. That’s more users than in the US (243M MAUs). Though Europe is Facebook’s second-biggest market in terms of revenues after the US. Last quarter, it pulled in $3.65BN in sales for Facebook (versus $7.3BN for the US) out of $15BN overall.

Apart from any moral or legal pressure Facebook might face to run a more responsible platform when it comes to supporting democratic processes, these numbers underscore the business imperative it has to sort this out in Europe.

Having a “war room” may sound like a start, but unfortunately Facebook is presenting it as an end in itself. And its foot-dragging on all of the bigger issues that need tackling, in effect, means the war will continue to drag on.

Microsoft extends its Cognitive Services with personalization service, handwriting recognition APIs and more

As part of its rather bizarre news dump before its flagship Build developer conference next week, Microsoft today announced a slew of new pre-built machine learning models for its Cognitive Services platform. These include an API for building personalization features, a form recognizer for automating data entry, a handwriting recognition API and an enhanced speech recognition service that focuses on transcribing conversations.

Maybe the most important of these new services is the Personalizer. There are few apps and websites, after all, that aren’t looking to provide their users with personalized features. That’s difficult, in part, because it often involves building models based on data that sits in a variety of silos. With Personalizer, Microsoft is betting on reinforcement learning, a machine learning technique that doesn’t need the labeled training data typically used to build models. Instead, the reinforcement agent constantly tries to find the best way to achieve a given goal based on what users do. Microsoft argues that it is the first company to offer a service like this, and it has been testing the service on Xbox, where it saw a 40% increase in engagement with content after implementing it.
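The loop Microsoft describes (rank an item, watch what the user does, feed the outcome back as a reward) is essentially a contextual bandit. The sketch below is a generic epsilon-greedy version of that loop, written to illustrate the technique; it is not the Personalizer API itself, and the action names, context and reward logic are invented.

```python
import random
from collections import defaultdict

# Generic epsilon-greedy bandit loop illustrating the rank -> reward pattern.
# This is not the Azure Personalizer API; names and rewards are made up.
actions = ["action_doc", "comedy_doc", "sports_doc"]
shows = defaultdict(int)      # how often each action has been shown
rewards = defaultdict(float)  # cumulative reward per action
EPSILON = 0.1                 # exploration rate

def rank(_context: dict) -> str:
    # Explore occasionally; otherwise exploit the best-known action so far.
    if random.random() < EPSILON or not shows:
        return random.choice(actions)
    return max(actions, key=lambda a: rewards[a] / shows[a] if shows[a] else 0.0)

def report_reward(action: str, reward: float) -> None:
    shows[action] += 1
    rewards[action] += reward

# Simulated users who click sports content 70% of the time it is shown.
for _ in range(1000):
    chosen = rank({"time_of_day": "evening"})
    clicked = 1.0 if chosen == "sports_doc" and random.random() < 0.7 else 0.0
    report_reward(chosen, clicked)

best = max(actions, key=lambda a: rewards[a] / max(shows[a], 1))
print(best)  # converges on "sports_doc" without any labeled training data
```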

The handwriting recognition API, or Ink Recognizer as it is officially called, can automatically recognize handwriting, common shapes and documents. That’s something Microsoft has long focused on as it developed its Windows 10 inking capabilities, so maybe it’s no surprise that it is now packaging this up as a cognitive service, too. Indeed, Microsoft Office 365 and Windows use exactly this service already, so we’re talking about a pretty robust system. With this new API, developers can now bring these same capabilities to their own applications, too.

Conversation Transcription does exactly what the name implies: it transcribes conversations and it’s part of Microsoft’s existing speech-to-text features in the Cognitive Services lineup. It can label different speakers, transcribe the conversation in real time and even handle crosstalk. It already integrates with Microsoft Teams and other meeting software.

Also new is the Form Recognizer, a new API that makes it easier to extract text and data from business forms and documents. This may not sound like a very exciting feature, but it solves a very common problem: the service needs only five samples to learn how to extract data, and users don’t have to do any of the arduous manual labeling that’s often involved in building these systems.

Form Recognizer is also coming to cognitive services containers, which allow developers to take these models outside of Azure and to their edge devices. The same is true for the existing speech-to-text and text-to-speech services, as well as the existing anomaly detector.

In addition, the company also today announced that its Neural Text-to-Speech, Computer Vision Read and Text Analytics Named Entity Recognition APIs are now generally available.

Some of these existing services are also getting some feature updates, with the Neural Text-to-Speech service now supporting five voices, while the Computer Vision API can now understand more than 10,000 concepts, scenes and objects, together with 1 million celebrities, compared to 200,000 in a previous version (are there that many celebrities?).

Couchbase’s mobile database gets built-in ML and enhanced synchronization features

Couchbase, the company behind the eponymous NoSQL database, announced a major update to its mobile database today that brings some machine learning smarts, as well as improved synchronization features and enhanced stats and logging support to the software.

“We’ve led the innovation and data management at the edge since the release of our mobile database five years ago,” Couchbase’s VP of Engineering Wayne Carter told me. “And we’re excited that others are doing that now. We feel that it’s very, very important for businesses to be able to utilize these emerging technologies that do sit on the edge to drive their businesses forward, and both making their employees more effective and their customer experience better.”

The latter part is what drove a lot of today’s updates, Carter noted. He also believes that the database is the right place to do some machine learning. So with this release, the company is adding predictive queries to its mobile database. This new API allows mobile apps to take pre-trained machine learning models and run predictive queries against the data that is stored locally. This would allow a retailer to create a tool that can use a phone’s camera to figure out what part a customer is looking for.

To support these predictive queries, Couchbase mobile is also getting support for predictive indexes. “Predictive indexes allow you to create an index on prediction, enabling correlation of real-time predictions with application data in milliseconds,” Carter said. In many ways, that’s also the unique value proposition for bringing machine learning into the database. “What you really need to do is you need to utilize the unique values of a database to be able to deliver the answer to those real-time questions within milliseconds,” explained Carter.
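Conceptually, a predictive query runs a registered, pre-trained model inline as part of a local query, and a predictive index caches the model’s output so repeated queries don’t have to re-run inference. The sketch below mimics that idea with plain Python and SQLite; the table, column and function names are hypothetical and this is not Couchbase Lite’s actual API.

```python
import sqlite3

# Hypothetical stand-in for a pre-trained model: classify a part from a
# (fake) image embedding. In Couchbase's case this would be a registered ML
# model; here it is just a function.
def predict_part(embedding: float) -> str:
    return "brake-pad" if embedding > 0.5 else "air-filter"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, embedding REAL)")
db.executemany("INSERT INTO parts (embedding) VALUES (?)",
               [(0.91,), (0.12,), (0.77,)])

# "Predictive index": run the model once per row, store the prediction and
# index it, so later queries filter on cached output instead of re-running
# inference each time.
db.execute("ALTER TABLE parts ADD COLUMN predicted_label TEXT")
for row_id, emb in db.execute("SELECT id, embedding FROM parts").fetchall():
    db.execute("UPDATE parts SET predicted_label = ? WHERE id = ?",
               (predict_part(emb), row_id))
db.execute("CREATE INDEX idx_pred ON parts (predicted_label)")

# "Predictive query": answered in milliseconds from the indexed predictions.
print(db.execute(
    "SELECT id FROM parts WHERE predicted_label = 'brake-pad'").fetchall())
```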

The other major new feature in this release is delta synchronization, which allows businesses to push far smaller updates to the databases on their employees’ mobile devices. That’s because devices only receive the information that changed instead of a full updated database. Carter says this was a highly requested feature, but until now the company always had to prioritize work on other components of Couchbase.
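In the abstract, delta sync means diffing the new revision of a document against the revision the device already holds and shipping only the changed fields. A minimal sketch of that idea (not Couchbase’s actual sync protocol or wire format):

```python
# Minimal illustration of delta sync: send only the fields that changed,
# not the whole document. This is the general idea, not Couchbase's protocol.
def make_delta(old: dict, new: dict) -> dict:
    delta = {k: v for k, v in new.items() if old.get(k) != v}
    delta["_removed"] = [k for k in old if k not in new]
    return delta

def apply_delta(old: dict, delta: dict) -> dict:
    doc = {k: v for k, v in old.items() if k not in delta.get("_removed", [])}
    doc.update({k: v for k, v in delta.items() if k != "_removed"})
    return doc

server_doc = {"sku": "A-100", "price": 19.99, "stock": 4, "aisle": 7}
device_doc = {"sku": "A-100", "price": 18.99, "stock": 9, "aisle": 7}

delta = make_delta(device_doc, server_doc)            # only price and stock changed
print(delta)
print(apply_delta(device_doc, delta) == server_doc)   # True
```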

This is an especially useful feature for the company’s retail customers, a vertical where it has been quite successful. These users need to keep their catalogs up to date, and quite a few of them supply their employees with mobile devices to help shoppers. Rumor has it that Apple, too, is a Couchbase user.

The update also includes a few new features that will be more of interest to operators, including advanced stats reporting and enhanced logging support.


Facebook accused of blocking wider efforts to study its ad platform

Facebook has been accused of blocking the ability of independent researchers to effectively study how political disinformation flows across its ad platform.

Adverts that the social network’s business is designed to monetize have — at the very least — the potential to influence people and push voters’ buttons, as the Cambridge Analytica Facebook data misuse scandal highlighted last year.

Since that story exploded into a major global scandal for Facebook, the company has faced a chorus of calls from policymakers on both sides of the Atlantic for increased transparency and accountability.

It has responded with lashings of obfuscation, misdirection and worse.

Among Facebook’s less controversial efforts to counter the threat that disinformation poses to its business are what it bills “election security” initiatives, such as identity checks for political advertisers. Even these efforts have looked hopelessly flat-footed, patchy and piecemeal in the face of concerted attempts to use its tools to amplify disinformation in markets around the world.

Perhaps more significantly — under amped up political pressure — Facebook has launched a searchable ad archive. And access to Facebook ad data certainly has the potential to let external researchers hold the company’s claims to account.

But only if access is not equally flat-footed, patchy and piecemeal, with the risk that selective access to ad data ends up being just as controlled and manipulated as everything else on Facebook’s platform.

So far Facebook’s efforts on this front continue to attract criticism for falling way short.

“the opposite of what they claim to be doing…”

The company opened access to an ad archive API last month, via which it provides rate-limited access to a keyword search tool that lets researchers query historical ad data. (Researchers first need to pass an identity check process and agree to the Facebook developer platform terms of service before they can access the API.)
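For reference, a keyword query against the archive’s Graph API endpoint looks roughly like the sketch below. The endpoint, parameter and field names reflect our reading of the 2019 documentation and may differ in practice, and a valid access token (issued only after Facebook’s identity check) is required; note that the mandatory search keyword is precisely the constraint researchers object to, since there is no way to ask for “all ads.”

```python
import requests

# Indicative sketch of querying the ad archive ("ads_archive") edge of the
# Graph API. Parameter and field names are based on the 2019 documentation
# as we understand it and may have changed since.
ACCESS_TOKEN = "<user_access_token>"  # issued after the identity check

params = {
    "access_token": ACCESS_TOKEN,
    "ad_type": "POLITICAL_AND_ISSUE_ADS",
    "ad_reached_countries": "['GB']",
    "search_terms": "brexit",   # keyword search is mandatory: no bulk export
    "fields": "page_name,ad_creative_body,spend,impressions,ad_delivery_start_time",
    "limit": 100,
}

resp = requests.get("https://graph.facebook.com/v3.3/ads_archive", params=params)
resp.raise_for_status()
for ad in resp.json().get("data", []):
    print(ad.get("page_name"), ad.get("spend"), ad.get("impressions"))
```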

However, a review of the tool by not-for-profit Mozilla rates the API as a lot of weak-sauce “transparency-washing” — rather than a good faith attempt to support public interest research that could genuinely help quantify the societal costs of Facebook’s ad business.

“The fact is, the API doesn’t provide necessary data. And it is designed in ways that hinders the important work of researchers, who inform the public and policymakers about the nature and consequences of misinformation,” it writes in a blog post, where it argues that Facebook’s ad API meets just two out of five minimum standards it previously set out — backed by a group of sixty academics, hailing from research institutions including Oxford University, the University of Amsterdam, Vrije Universiteit Brussel, Stiftung Neue Verantwortung and many more.

Instead of providing comprehensive political advertising content, as the experts argue a good open API must, Mozilla writes that “it’s impossible to determine if Facebook’s API is comprehensive, because it requires you to use keywords to search the database.”

“It does not provide you with all ad data and allow you to filter it down using specific criteria or filters, the way nearly all other online databases do. And since you cannot download data in bulk and ads in the API are not given a unique identifier, Facebook makes it impossible to get a complete picture of all of the ads running on their platform (which is exactly the opposite of what they claim to be doing),” it adds.

Facebook’s tool is also criticized for failing to provide targeting criteria and engagement information for ads — thereby making it impossible for researchers to understand what advertisers on its platform are paying the company to reach; as well as how effective (or otherwise) these Facebook ads might be.

This exact issue was raised with a number of Facebook executives by British parliamentarians last year, during the course of a multi-month investigation into online disinformation. At one point Facebook’s CTO was asked point-blank whether the company would be providing ad targeting data as part of planned political ad transparency measures — only to provide a fuzzy answer.

Of course there are plenty of reasons why Facebook might be reluctant to enable truly independent outsiders to quantify the efficacy of political ads on its platform and therefore, by extension, its ad business.

Including, of course, the specific scandalous example of the Cambridge Analytica data heist itself. That was carried out by an academic, Dr. Aleksandr Kogan, then attached to Cambridge University, who used his access to Facebook’s developer platform to deploy a quiz app designed to harvest user data without (most) people’s knowledge or consent, in order to sell the info to the disgraced digital campaign company (which worked on various U.S. campaigns, including the presidential campaigns of Ted Cruz and Donald Trump).

But that just highlights the scale of the problem of so much market power being concentrated in the hands of a single adtech giant that has zero incentives to voluntarily report wholly transparent metrics about its true reach and power to influence the world’s 2 billion+ Facebook users.

Add to that, in a typical crisis PR response to multiple bad headlines last year, Facebook repeatedly sought to paint Kogan as a rogue actor — suggesting he was not at all a representative sample of the advertiser activity on its platform.

So, by the same token, any effort by Facebook to tar genuine research as similarly risky rightly deserves a robust rebuttal. The historical actions of one individual, albeit yes an academic, shouldn’t be used as an excuse to shut the door to a respected research community.

“The current API design puts huge constraints on researchers, rather than allowing them to discover what is really happening on the platform,” Mozilla argues, suggesting the various limitations imposed by Facebook — including search-rate limits — means it could take researchers “months” to evaluate ads in a particular region or on a certain topic.

Again, from Facebook’s point of view, there’s plenty to be gained by delaying the release of any more platform usage skeletons from its bulging historical data closet. (The “historical app audit” it announced with much fanfare last year continues to trickle along at a disclosure pace of its own choosing.)

The two areas where Facebook’s API gets a tentative thumbs up from Mozilla are in providing access to up-to-date and historical data (the seven-year availability of the data is badged “pretty good”); and in the API being accessible to and shareable with the general public (at least once they’ve gone through Facebook’s identity confirmation process).

Though in both cases Mozilla also cautions it’s still possible that further blocking tactics might emerge — depending on how Facebook supports/constrains access going forward.

It does not look entirely coincidental that the criticism of Facebook’s API for being “inadequate” has landed on the same day that Facebook has pushed out publicity about opening access to a database of URLs its users have linked to since 2017 — which is being made available to a select group of academics.

In that case, access goes to 60 researchers, drawn from 30 institutions, who have been chosen by the U.S.-based Social Science Research Council.

Notably the Facebook-selected research data set entirely skips past the 2016 U.S. presidential election, when Russian election propaganda infamously targeted hundreds of millions of U.S. Facebook voters.

The U.K.’s 2016 Brexit vote is also not covered by the January 2017 onwards scope of the data set.

Facebook does say it is “committed to advancing this important initiative,” suggesting it could expand the scope of the data set and/or who can access it at some unspecified future time.

It also claims “privacy and security” considerations are holding up efforts to release research data quicker.

“We understand many stakeholders are eager for data to be made available as quickly as possible,” it writes. “While we remain committed to advancing this important initiative, Facebook is also committed to taking the time necessary to incorporate the highest privacy protections and build a data infrastructure that provides data in a secure manner.”

In Europe, Facebook committed itself to supporting good faith, public interest research when it signed up to the European Commission’s Code of Practice on disinformation last year.

The EU-wide Code includes a specific commitment that platform signatories “empower the research community to monitor online disinformation through privacy-compliant access to the platforms’ data,” in addition to other actions such as tackling fake accounts and making political ads and issue-based ads more transparent.

However, here, too, Facebook appears to be using “privacy-compliance” as an excuse to water down the level of transparency that it’s offering to external researchers.

TechCrunch understands that, in private, Facebook has responded to concerns raised about its ad API’s limits by saying it cannot provide researchers with fuller data about ads — including the targeting criteria for ads — because doing so would violate its commitments under the EU’s General Data Protection Regulation (GDPR) framework.

That argument is of course pure “cakeism.” AKA Facebook is trying to have its cake and eat it where privacy and data protection is concerned.

In plainer English, Facebook is trying to use European privacy regulation to shield its business from deeper and more meaningful scrutiny. Yet this is the very same company — and here comes the richly fudgy cakeism — that elsewhere contends personal data its platform pervasively harvests on users’ interests is not personal data. (In that case Facebook has also been found allowing sensitive inferred data to be used for targeting ads — which experts suggest violates the GDPR.)

So, tl;dr, Facebook can be found seizing upon privacy regulation when it suits its business interests to do so — i.e. to try to avoid the level of transparency necessary for external researchers to evaluate the impact its ad platform and business has on wider society and democracy … yet argues against GDPR when the privacy regulation stands in the way of monetizing users’ eyeballs by stuffing them with intrusive ads targeted by pervasive surveillance of everyone’s interests.

Such contradictions have not at all escaped privacy experts.

“The GDPR in practice — not just Facebook’s usual weak interpretation of it — does not stop organisations from publishing aggregate information, such as which demographics or geographic areas saw or were targeted for certain adverts, where such data is not fine-grained enough to pick an individual out,” says Michael Veale, a research fellow at the Alan Turing Institute — and one of 10 researchers who co-wrote the Mozilla-backed guidelines for what makes an effective ad API.

“Facebook would require a lawful basis to do the aggregation for the purpose of publishing, which would not be difficult, as providing data to enable public scrutiny of the legality and ethics of data processing is a legitimate interest if I have ever seen one,” he also tells us. “Facebook constantly reuse data for different and unclearly related purposes, and so claiming they could legally not reuse data to put their own activities in the spotlight is, frankly, pathetic.

“Statistical agencies have long been familiar with techniques such as differential privacy which stop aggregated information leaking information about specific individuals. Many differential privacy researchers already work at Facebook, so the expertise is clearly there.”
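The technique Veale references is simple to illustrate: to publish an aggregate count with differential privacy, you add Laplace noise scaled to the count’s sensitivity, so the published figure stays useful in aggregate while revealing almost nothing about any one person. A minimal sketch:

```python
import math
import random

def dp_count(true_count: int, epsilon: float = 0.5, sensitivity: int = 1) -> int:
    # Differentially private count: add Laplace(sensitivity / epsilon) noise.
    # Adding or removing one person changes the true count by at most
    # `sensitivity`, so the noise masks any individual's presence.
    scale = sensitivity / epsilon
    u = random.random() - 0.5
    noise = -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return max(0, round(true_count + noise))

# e.g. how many people in one demographic bucket saw a given advert
print(dp_count(1342))  # close to 1342, but deliberately never exact
```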

“It seems more likely that Facebook doesn’t want to release information on targeting as it would likely embarrass [it] and their customers,” Veale adds. “It is also possible that Facebook has confidentiality agreements with specific advertisers who may be caught red-handed for practices that go beyond public expectations. Data protection law isn’t blocking the disinfecting light of transparency, Facebook is.”

Asked about the URL database that Facebook has released to selected researchers today, Veale says it’s a welcome step — while pointing to further limitations.

“It’s a good thing that Facebook is starting to work more openly on research questions, particularly those which might point to problematic use of this platform. The initial cohort appears to be geographically diverse, which is refreshing — although appears to lack any academics from Indian universities, far and away Facebook’s largest user base,” he says. “Time will tell whether this limited data set will later expand to other issues, and how much researchers are expected to moderate their findings if they hope for continued amicable engagement.”

“It’s very possible for Facebook to effectively cherry-pick data sets to try to avoid issues they know exist, but you also cannot start building a collaborative process on all fronts and issues. Time will tell how open the multinational wishes to be,” Veale adds.

We’ve reached out to Facebook for comment on the criticism of its ad archive API.

A new era for enterprise IT

Amidst the newly minted scooter unicorns, ebbs and flows of bitcoin investments and wagers on the price of Uber’s IPO, another trend has shaken up the tech industry: the explosion of enterprise software successes.

Bessemer notes that today there are 55 private companies valued at $1 billion or more, compared to zero a decade ago. Proving this isn’t just private market hype, enterprise cloud companies have well exceeded $500 billion in market cap and are on a path to hit $1 trillion in the next few years. Whether it’s the masterfully executed IPOs of Zoom and PagerDuty, the imminent Slack IPO, or the mega funding rounds of companies like Asana, Airtable, Front and many others, the insatiable demand for enterprise cloud deals shows that the new era of IT is no longer a zero sum game.

Back when we started Box in 2005, we saw a disruption on the horizon that would change enterprise software as we knew it. Driven by the same trends that were impacting the consumer internet — growth of mobile, faster web browsers, more users connected online — combined with the advent of the cloud, enterprises in every industry have been forced to transform in the digital age. But we could barely have imagined the scale of change to come.

A tipping point for best-of-breed IT

Today’s enterprise software market doesn’t look like the enterprise software of the past. For one, the market is much larger. Deploying software in the on-prem world required a team of highly trained professionals and a hefty budget. By lowering costs and removing adoption hurdles, the cloud expanded the market from millions to billions of people globally and, in turn, businesses are using more apps than ever before. In fact, Okta found in its latest Business @ Work Report that large enterprises deploy 129 apps on average. It’s therefore no surprise that software spend is expected to reach more than $420 billion in 2019 as the shift to the cloud marches on.

With a market of that magnitude, enterprise IT can no longer be controlled by just a handful of vendors, as it was in the ’90s. And what were once solved problems in a prior era of IT are now unsolved relative to rapidly changing user and buyer expectations in the cloud, leaving the door open for new disruptors to emerge and solve these problems better, faster, and with more focused visions.

Previously pesky problems like alerting ops teams to technical issues have turned into an entire platform for real-time operations, leading to PagerDuty’s $3 billion valuation in the process.

Everyone thought video conferencing was a tired market but Zoom proved that with extreme focus and simple user-experience its team could build a company worth over $15 billion. Atlassian has generated $25 billion in value by building a portfolio of modern development and IT tools that power a digital enterprise.

Slack has shown that real-time communication and workflow automation can be reinvented yet again. And making this approach work seamlessly are services like Okta, which is valued at $10 billion today.

In all of these cases, “best-of-breed” platforms are growing rapidly in their respective markets, with near limitless size and potential. And as processes for every team, department, business, and industry can now be digitized, we’ll continue to see this play out in every category of technology.

If the move from mainframe and mini-computers to PC saw a 10X increase in applications and software, the move from PC to cloud and mobile will see an order of magnitude more.

From IT stacks to cloud ecosystems

We’ve reached a new era of enterprise software, and companies are coming around to this model in droves. What seemed unfathomable merely a decade ago is now becoming commonplace as Fortune 500 companies are mixing and matching best-in-class technologies — from upstarts to cloud mainstays like Salesforce, Workday, and ServiceNow — to power their business. But there’s still work to do.

To ensure customers get all the benefits of a best-of-breed cloud ecosystem, these tools must work together without requiring the customer to stitch systems together manually. Without interoperability and integration, enterprises will be left with siloed data, fragmented workflows and security gaps in the cloud. In a legacy world, the idea of deep integration between software stacks was great on paper, but near impossible in practice. As Larry Ellison described in Softwar, customers were left footing the bill for putting together independent technology themselves. But the rules have changed with today’s generation of API-native companies with open cultures and a deep focus on putting the customer first.

Notably, even the largest players — IBM, Microsoft, Google, Cisco, and others — have recognized this tectonic shift, a harbinger of what’s to come in the industry. Satya Nadella, in taking over Microsoft, recognized the power of partnerships in a world where IT spend would be growing exponentially, telling Wired:

…instead of viewing things as zero sum, let’s view things as, ‘Hey, what is it that we’re trying to get done? What is it that they’re trying to get done? Places where we can co-operate, let’s co-operate.’ And where we’re competing, we compete.

As Peter Sole, former head of the Research Board, points out, in this digital world we can no longer think about a few vendors owning layers in a stack, but instead about an ecosystem of multiple services working together to deliver value to the entire network. The incumbents that thrive in the digital age will be those that, despite their scale, work and operate like the nimbler, customer-obsessed, more open disruptors. And those that don’t will face a reckoning from customers that, for the first time, have the choice to go in a different direction.

Gone are the days of monolithic IT stacks and zero sum thinking; this is the new normal. Welcome to a new era of enterprise IT.

The new new web

Over the last five years, almost everything about web development has changed. Oh, the old tech still works: your WordPress and Ruby on Rails sites still function just fine — but they’re increasingly being supplanted by radical new approaches. The contents of your browser are being sliced, diced, rendered, and processed in wholly new ways nowadays, and the state of the art is currently in serious flux. What follows is a brief tour of what I like to call the New New Web:

Table of Contents

  1. Single-Page Apps
  2. Headless CMSes
  3. Static Site Generators
  4. The JAMStack
  5. Hosting and Serverlessness
  6. Summary

1. Single-Page Apps

These have become so much the norm — our web projects at HappyFunCorp are almost always single-page apps nowadays — that it’s easy to forget how new and radical they were when they first emerged, in the days of jQuery running in pages dynamically built from templates on the server.
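For readers who came up after that era, here is a minimal, framework-free sketch of the single-page pattern: the server ships one static HTML shell, and the browser fetches data and renders views itself. The `/api/projects` endpoint, the `Project` type, and the `#app` element are hypothetical stand-ins for illustration, not any particular framework’s API.

```ts
// index.ts — a deliberately framework-free single-page app sketch.
// The server serves only a static HTML shell containing <div id="app"></div>;
// everything below runs in the browser after that one page load.

type Project = { id: number; name: string };

// Render a "view" entirely on the client instead of a server template.
function renderProjectList(projects: Project[]): string {
  return `<ul>${projects.map((p) => `<li>${p.name}</li>`).join("")}</ul>`;
}

async function showProjects(): Promise<void> {
  // Hypothetical JSON endpoint standing in for whatever backend you use.
  const res = await fetch("/api/projects");
  const projects: Project[] = await res.json();
  document.getElementById("app")!.innerHTML = renderProjectList(projects);
}

// Client-side "routing": swap views on hash changes, with no full page reloads.
window.addEventListener("hashchange", () => {
  if (location.hash === "#/projects") void showProjects();
});
```

In practice a framework like React or Vue handles the rendering and routing for you, but the shape is the same: one page load, then JSON requests and client-side DOM updates from there.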

Harry Potter, the Platform, and the Future of Niantic

What is Niantic? Most people who recognize the name would rightly tell you it’s a company that makes mobile games, like Pokémon GO, Ingress, or Harry Potter: Wizards Unite.

But no one at Niantic really seems to box it up as a mobile gaming company. Making these games is a big part of what the company does, yes, but the games are part of a bigger picture: They are a springboard, a place to figure out the constraints of what they can do with augmented reality today, and to figure out how to build the tech that moves it forward. Niantic wants to wrap their learnings back into a platform upon which others can build their own AR products, be it games or something else. And they want to be ready for whatever comes after smartphones.

Niantic is a bet on augmented reality becoming more and more a part of our lives; when that happens, they want to be the company that powers it.

This is Part 3 of our EC-1 series on Niantic, looking at its past, present, and potential future. You can find Part 1 here and Part 2 here. The reading time for this article is 24 minutes (6,050 words).

The platform play

After the absurd launch of Pokémon GO, everyone wanted a piece of the AR pie. Niantic got more pitches than they could take on, I’m told, as rights holders big and small reached out to see if the company might build something with their IP or franchise.

But Niantic couldn’t build it all. From art, to audio, to even just thinking up new gameplay mechanics, each game or project they took on would require a mountain of resources. What if they focused on letting these other companies build these sorts of things themselves?

That’s the idea behind Niantic’s Real World Platform. This platform is a key part of Niantic’s game plan moving forward, with the company having as many people working on the platform as it has on its marquee money maker, Pokémon GO.

There are tons of pieces that go into making things like GO or Ingress, and Niantic has spent the better part of the last decade figuring out how to make them all fit together. They’ve built the core engine that powers the games and, after a bumpy start with Pokémon GO’s launch, figured out how to scale it to hundreds of millions of users around the world. They’ve put in the work to figure out how to detect cheaters and spoofers and give them the boot. They’ve built a social layer, with systems like friendships and trading. And they’ve already amassed the real-world location data that proved so challenging back when they were building Field Trip, with all of those points of interest that now serve as portals and Pokéstops.

Niantic could help other companies with real-world events, too. That might seem funny after the mess that was the first Pokémon GO Fest (as detailed in Part 2). But Niantic turned around, went back to the same city the next year, and pulled it off. That experience — that battle-testing — is valuable. Meanwhile, the company has pulled off countless huge Ingress events, and a number of Pokémon GO side events called “Safari Zones.” CTO Phil Keslin confirmed to me that event management is planned as part of the platform offering.

As Niantic builds new tech — like, say, more advanced AR or faster ways to sync AR experiences between devices — it’ll all get rolled into the platform. With each problem they solve, the platform offering grows.

But first they need to prove that there’s a platform to stand on.

Harry Potter: Wizards Unite

Niantic’s platform, as it exists today, is the result of years of building their own games. It’s the collection of tools they’ve built and rebuilt along the way, and that already powers Ingress Prime and Pokémon GO. But to prove itself as a platform company, Niantic needs to show that they can do it again. That they can take these engines, these tools, and, working with another team, use them for something new.

Vizion.ai launches its managed Elasticsearch service

Setting up Elasticsearch, the open-source system that many companies large and small use to power their distributed search and analytics engines, isn’t the hardest thing. What is very hard, though, is to provision the right amount of resources to run the service, especially when your users’ demand comes in spikes, without overpaying for unused capacity. Vizion.ai’s new Elasticsearch Service does away with all of this by essentially offering Elasticsearch as a service and only charging its customers for the infrastructure they use.

Vizion’s service automatically scales up and down as needed. It’s a managed service delivered as a SaaS platform that supports deployments on both private and public clouds, with full API compatibility with the standard Elastic stack, which typically includes tools like Kibana for visualizing data, Beats for shipping data to the service, and Logstash for transforming incoming data and setting up data pipelines. Users can also easily create separate stacks for testing and development, for example.
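Because the service advertises full API compatibility with the standard Elastic stack, the usual Elasticsearch REST calls should work unchanged once they’re pointed at a managed endpoint. Here is a minimal sketch, assuming a hypothetical cluster URL and API key (placeholders, not Vizion.ai’s actual endpoint or auth scheme):

```ts
// es-sketch.ts — standard Elasticsearch REST calls; only the base URL and
// credentials change when you point at a managed, API-compatible cluster.

const ES_URL = "https://search.example.com"; // placeholder endpoint
const HEADERS = {
  "Content-Type": "application/json",
  Authorization: "ApiKey <your-api-key>", // placeholder credentials
};

async function main() {
  // Index a document; Elasticsearch creates the "logs" index on first write.
  // refresh=wait_for makes the doc immediately searchable (fine for a demo).
  await fetch(`${ES_URL}/logs/_doc?refresh=wait_for`, {
    method: "POST",
    headers: HEADERS,
    body: JSON.stringify({ service: "checkout", level: "error", ts: Date.now() }),
  });

  // Query it back with the normal search DSL.
  const res = await fetch(`${ES_URL}/logs/_search`, {
    method: "POST",
    headers: HEADERS,
    body: JSON.stringify({ query: { match: { level: "error" } } }),
  });
  console.log(await res.json());
}

main().catch(console.error);
```

The pitch, in other words, is that everything behind that base URL — instance sizing, scaling, redundancy — stops being the user’s problem.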

Vizion.ai GM and VP Geoff Tudor

“When you go into the AWS Elasticsearch service, you’re going to be looking at dozens or hundreds of permutations for trying to build your own cluster,” Vizion.ai’s VP and GM Geoff Tudor told me. “Which instance size? How many instances? Do I want geographical redundancy? What’s my networking? What’s my security? And if you choose wrong, then that’s going to impact the overall performance. […] We do balancing dynamically behind that infrastructure layer.” To do this, the service looks at the utilization patterns of a given user and then allocates resources to optimize for the specific use case.

What Vizion has done here is take some of the work from its parent company Panzura, a multi-cloud storage service for enterprises that has plenty of patents around data caching, and applied it to this new Elasticsearch service.

There are obviously other companies that offer commercial Elasticsearch platforms already. Tudor acknowledges this, but argues that his company’s platform is different. With other products, he argues, you have to decide on the size of your block storage for your metadata upfront, for example, and you typically want SSDs for better performance, which can quickly get expensive. Thanks to Panzura’s IP, Vizion.ai is able to bring down the cost by caching recent data on SSDs and keeping the rest in cheaper object storage pools.
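The trade-off Tudor describes is a classic hot/cold tiering pattern. Purely as an illustration of the idea (not Panzura’s or Vizion.ai’s actual implementation; the `Store` interface and both backends are hypothetical), the read path looks roughly like this:

```ts
// tiering-sketch.ts — illustrative hot/cold read path, not vendor code.

interface Store {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, value: Uint8Array): Promise<void>;
}

class TieredReader {
  constructor(
    private ssdCache: Store, // small, fast, expensive per GB
    private objectStore: Store, // large, slower, cheap per GB
  ) {}

  async read(key: string): Promise<Uint8Array | undefined> {
    // Recent, hot data is served straight from the SSD tier.
    const hot = await this.ssdCache.get(key);
    if (hot) return hot;

    // Everything else lives in cheap object storage...
    const cold = await this.objectStore.get(key);
    // ...and gets promoted so the next read of the same key is fast.
    if (cold !== undefined) await this.ssdCache.put(key, cold);
    return cold;
  }
}
```

The eviction policy and what counts as “recent” is presumably where the patented caching work comes in; the sketch only shows why the economics work out, since the SSD tier stays small while the bulk of the data sits in object storage.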

He also noted that the company is positioning the overall Vizion.ai service, with the Elasticsearch service as one of the earliest components, as a platform for running AI and ML workloads. Support for TensorFlow, PredictionIO (which plays nicely with Elasticsearch) and other tools is also in the works. “We want to make this an easy serverless ML/AI consumption in a multi-cloud fashion, where not only can you leverage the compute, but you can also have your storage of record at a very cost-effective price point.”