An argument against cloud-based applications

In the last decade we’ve seen massive changes in how we consume and interact with our world. The Yellow Pages is a concept that has to be meticulously explained with an impertinent scoff at our own age. We live within our smartphones, within our apps.

While we thrive with the information of the world at our fingertips, we casually throw away any semblance of privacy in exchange for the convenience of this world.

This line we straddle has been drawn with recklessness and calculation by big tech companies over the years as we’ve come to terms with what app manufacturers, large technology companies, and app stores demand of us.

Our private data, into the cloud

According to Symantec, 89% of Android apps and 39% of iOS apps request access to private information. That access sends our data to cloud servers, both to power the application’s features (think of the data a fitness app needs) and to store data for advertising demographics.

While large data companies would argue that data is not held for long, or not used in a nefarious manner, when we use the apps on our phones, we create an undeniable data trail. Companies generally keep data on the move, replicating it across servers around the world, ever further from its source.

Once we accept the terms and conditions we rarely read, our private data is no longer such. It is in the cloud, a term which has eluded concrete understanding throughout the years.

A distinction must be drawn between cloud-based apps and cloud computing. Cloud computing at an enterprise level, while argued against ad nauseam over the years, is generally considered a secure and cost-effective option for many businesses.

Even back in 2010, Microsoft said 70% of its team was working on things that were cloud-based or cloud-inspired, and the company projected that number would rise to 90% within a year. That was before we started relying on the cloud to store our most personal, private data.

Cloudy with a chance of confusion

To add complexity to this issue, there are literally apps to protect your privacy from other apps on your smartphone. Tearing more meat off the privacy bone, these apps themselves require a level of access that would generally raise eyebrows in any other category of app.

Consider the scenario where you use a key to encrypt data, but then you need to encrypt that key to make it safe. Ultimately, you end up with the most important keys not being encrypted. There is no win-win here. There is only finding a middle ground of contentment in which your apps find as much purchase in your private data as your doctor finds in your medical history.
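To make that regress concrete, here is a minimal sketch in Python using the third-party cryptography library. The names and data are illustrative, not any particular app’s design: each key can be wrapped by another key, but whatever key sits at the top of the chain has to live somewhere in plaintext.

```python
# A minimal sketch of the key-wrapping regress described above, using
# the third-party "cryptography" package (pip install cryptography).
# All names here are illustrative, not any specific app's design.
from cryptography.fernet import Fernet

# Encrypt the user's data with a data key.
data_key = Fernet.generate_key()
ciphertext = Fernet(data_key).encrypt(b"private user data")

# The data key itself is sensitive, so wrap it with a master key.
master_key = Fernet.generate_key()
wrapped_data_key = Fernet(master_key).encrypt(data_key)

# ...but now the master key is the sensitive thing. Wrapping it again
# just moves the problem up a level: whatever key ends the chain must
# be stored somewhere in plaintext (disk, keychain, or a remote
# service you have to trust).
print("master key is stored unencrypted:", master_key.decode())
```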

The cloud is not tangible, nor is it something we as givers of the data can access. Each company has its own cloud servers, each one collecting similar data. But we have to consider why we give up this data. What are we getting in return? We are given access to applications that perhaps make our lives easier or better, but essentially are a service. It’s this service end of the transaction that must be altered.

App developers have to find a method of service delivery that does not require the storage of personal data. There are two sides to this. The first is creating algorithms that can function locally, rather than centrally, where data is mixed with other data sets. The second is a shift away from the industry’s prevailing attitude, in which free services are provided at the cost of your personal data (which is ultimately used to foster marketing opportunities).

Of course, asking this of any big data company that thrives on its data collection and marketing process is futile. So the change has to come from new companies, willing to risk offering privacy while still providing a service worth paying for. Because it wouldn’t be free. It cannot be free, as free is what got us into this situation in the first place.

Clearing the clouds of future privacy

What we can do right now is at least take a stance of personal vigilance. While there is some personal data whose flow onto cloud servers around the world we cannot stem, we can at least limit our use of frivolous apps that collect too much data. For instance, games should never need access to our contacts, our camera and so on. Everything within our phones is connected, which is why Facebook seems to know everything about us, down to what’s in our bank accounts.

This sharing takes place on our phone and at the cloud level, and is something we need to consider when accepting the terms on a new app. When we sign into apps with our social accounts, we are just assisting the further collection of our data.

The cloud isn’t some omnipotent enemy here, but it is the excuse and tool that allows the mass collection of our personal data.

The future is likely one in which devices and apps finally become self-sufficient and localized, enabling users to maintain control of their data. The way we access apps and data in the cloud will change as well, as we’ll demand a functional process that forces a methodology change in service provisions. The cloud will be relegated to public data storage, leaving our private data on our devices where it belongs. We have to collectively push for this change, lest we lose whatever semblance of privacy in our data we have left.

Vantage makes managing AWS easier

Vantage, a new service that makes managing AWS resources and their associated spend easier, is coming out of stealth today. The service offers its users an alternative to the complex AWS console, with support for most of the standard AWS services, including EC2 instances, S3 buckets, VPCs, ECS and Fargate, and Route 53 hosted zones.

The company’s founder, Ben Schaechter, previously worked at AWS and DigitalOcean (and before that, he worked on Crunchbase, too). While DigitalOcean showed him how to build a developer experience for individuals and small businesses, he argues that its underlying services and hardware simply weren’t as robust as those of the hyperclouds. AWS, on the other hand, offers everything a developer could want (and likely more), but the user experience leaves a lot to be desired.


“The idea was really born out of ‘what if we could take the user experience of DigitalOcean and apply it to the three public cloud providers, AWS, GCP and Azure,’” Schaechter told me. “We decided to start just with AWS because the experience there is the roughest and it’s the largest player in the market. And I really think that we can provide a lot of value there before we do GCP and Azure.”

The focus for Vantage is on the developer experience and cost transparency. Schaechter noted that some of its users describe it as being akin to a “Mint for AWS.” To get started, you give Vantage a set of read permissions to your AWS services and the tool will automatically profile everything in your account. The service refreshes this list once per hour, but users can also refresh their lists manually.
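Vantage hasn’t published its internals, but as a rough sketch of what read-only profiling can look like, here is a short boto3 script that inventories a few of the supported resource types. This illustrates the general approach, not Vantage’s actual code.

```python
# A rough sketch of read-only account profiling with boto3
# (pip install boto3). This is not Vantage's implementation, just an
# illustration of what a set of read permissions makes possible.
import boto3

session = boto3.Session()  # uses your configured AWS credentials

# Describe EC2 instances (read-only call).
ec2 = session.client("ec2")
for reservation in ec2.describe_instances()["Reservations"]:
    for instance in reservation["Instances"]:
        print("EC2:", instance["InstanceId"], instance["InstanceType"])

# List S3 buckets (read-only call).
s3 = session.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    print("S3:", bucket["Name"])

# List Route 53 hosted zones (read-only call).
route53 = session.client("route53")
for zone in route53.list_hosted_zones()["HostedZones"]:
    print("Route 53:", zone["Name"])
```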

Given that it’s often hard enough to know which AWS services you are actually using, that alone is a useful feature. “That’s the number one use case,” he said. “What are we paying for and what do we have?”

At the core of Vantage are what the team calls “views,” which show you which resources you are using. What is interesting here is that this is quite a flexible system, one that lets you build custom views to see, for example, which resources you are using for a given application across regions. Those may include Lambda, storage buckets, your subnet, code pipeline and more.

On the cost-tracking side, Vantage currently only offers point-in-time costs, but Schaechter tells me that the team plans to add historical trends as well to give users a better view of their cloud spend.

Schaechter and his co-founder bootstrapped the company, and he noted that before he raises any money for the service, he wants to see people paying for it. Currently, Vantage offers a free plan, as well as paid “pro” and “business” plans with additional functionality.


AWS launches Glue Elastic Views to make it easier to move data from one purpose-built data store to another

AWS has launched Glue Elastic Views, a new tool that lets developers move data from one store to another.

At the AWS re:Invent keynote, CEO Andy Jassy announced Glue Elastic Views, a service that lets programmers move data across multiple data stores more seamlessly.

The new service can take data from disparate silos and bring it together. The AWS ETL service allows programmers to write a little bit of SQL code to define a materialized view that can move data from one source data store to another.

For instance, Jassy said, a programmer can move data from DynamoDB to Elasticsearch by setting up a materialized view to copy that data, with the service managing the dependencies. That means if data changes in the source data lake, it will automatically be updated in the other data stores where the data has been relocated, Jassy said.
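As a purely conceptual sketch (Glue Elastic Views itself is configured with SQL, whose exact syntax isn’t reproduced here), the pattern looks like this in plain Python: every change to the source store is propagated through a view definition into the target store.

```python
# Purely illustrative: the materialized-view pattern that Glue Elastic
# Views automates, sketched with plain Python dicts standing in for
# the real data stores. The real service is configured with SQL.

source_store = {}   # stands in for, e.g., a DynamoDB table
target_store = {}   # stands in for, e.g., an Elasticsearch index

def materialize(record: dict) -> dict:
    """The 'view' definition: which fields the derived copy contains."""
    return {"id": record["id"], "name": record["name"]}

def on_source_change(record: dict) -> None:
    """Runs on every write to the source store.

    The service's job is to run this propagation for you, keeping the
    target in sync as the source changes, instead of you scheduling
    periodic bulk ETL jobs."""
    source_store[record["id"]] = record
    target_store[record["id"]] = materialize(record)

on_source_change({"id": "42", "name": "Ada", "internal_flag": True})
assert target_store["42"] == {"id": "42", "name": "Ada"}
```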

“When you have the ability to move data… and move that data easily from data store to data store… that’s incredibly powerful,” said Jassy.

Come June 1, 2021, all of your new photos will count against your free Google storage

Come June 1, 2021, Google will change its storage policies for free accounts — and not for the better. Basically, if you’re on a free account and a semi-regular Google Photos user, get ready to pay up next year and subscribe to Google One.

Currently, every free Google account comes with 15 GB of online storage for all your Gmail, Drive and Photos needs. Email and the files you store in Drive already count against those 15 GB, but come June 1, all Docs, Sheets, Slides, Drawings, Forms or Jamboard files will count against the free storage as well. Those tend to be small files, but maybe most important here, virtually all of your Photos uploads will now count against those 15 GB, too.

That’s a big deal because today, Google Photos lets you store unlimited images (and unlimited video, if it’s in HD) for free as long as they are under 16MP in resolution or you opt to have Google degrade the quality. Come June of 2021, any new photo or video uploaded in high quality, which currently wouldn’t count against your allocation, will count against those free 15 GB.


As people take more photos every year, that free allotment won’t last very long. Google argues that 80% of its users will have at least three years to reach those 15 GB. Given that you’re reading TechCrunch, though, chances are you’re in the 20% that will run out of space much faster (or you’re already on a Google One plan).
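A quick back-of-envelope calculation shows why the three-year figure is plausible for casual users but optimistic for heavy ones. The photo size and upload rate below are assumptions, not Google’s numbers:

```python
# Back-of-envelope check on Google's three-year claim. The photo size
# and upload rate below are assumptions, not Google's figures.
free_quota_gb = 15
avg_photo_mb = 3          # assumed size of a compressed "high quality" photo
photos_per_week = 30      # assumed upload rate

gb_per_year = avg_photo_mb * photos_per_week * 52 / 1024
print(f"~{gb_per_year:.1f} GB/year -> "
      f"{free_quota_gb / gb_per_year:.1f} years to fill 15 GB")
# ~4.6 GB/year -> 3.3 years, roughly in line with Google's estimate.
# Heavier users, or anyone uploading video, will hit the cap far sooner.
```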

Some good news: To make this transition a bit easier, photos and videos uploaded in high quality before June 1, 2021 will not count toward the 15 GB of free storage. As usual, original quality images will continue to count against it, though. And if you own a Pixel device, even after June 1, you can still upload an unlimited number of high-quality images from those.

To let you see how long your current storage will last, Google will now show you personalized estimates, too, and come next June, the company will release a new free tool for Photos that lets you more easily manage your storage. It’ll also show you dark and blurry photos you may want to delete — but then, for a long time Google’s promise was you didn’t have to worry about storage (remember Google’s old Gmail motto? “Archive, don’t delete!”).

In addition to these storage updates, there are a few additional changes worth knowing about. If your account is inactive in Gmail, Drive or Photos for more than two years, Google “may” delete the content in that product. So if you use Gmail but don’t use Photos for two years because you use another service, Google may delete any old photos you had stored there. And if you stay over your storage limit for two years, Google “may delete your content across Gmail, Drive and Photos.”

Cutting back a free and (in some cases) unlimited service is never a great move. Google argues that it needs to make these changes to “continue to provide everyone with a great storage experience and to keep pace with the growing demand.”

People now upload more than 4.3 million GB to Gmail, Drive and Photos every day. That’s not cheap, I’m sure, but Google also controls every aspect of this and must have had some internal projections of how this would evolve when it first set those policies.

To some degree, though, this was maybe to be expected. This isn’t the freewheeling Google of 2010 anymore, after all. We’ve already seen some indications that Google may reserve some advanced Photos features for Google One subscribers, for example. This new move will obviously push more people to pay for Google One, and more money from Google One means a little less dependence on advertising for the company.

Microsoft announces its first Azure data center region in Taiwan

After announcing its latest data center region in Austria earlier this month and an expansion of its footprint in Brazil, Microsoft today unveiled its plans to open a new region in Taiwan. This new region will augment its existing presence in East Asia, where the company already runs data centers in China (operated by 21Vianet), Hong Kong, Japan and Korea. This new region will bring Microsoft’s total presence around the world to 66 cloud regions.

Similar to its recent expansion in Brazil, Microsoft also pledged to provide digital skilling for over 200,000 people in Taiwan by 2024 and it is growing its Taiwan Azure Hardware Systems and Infrastructure engineering group, too. That’s in addition to investments in its IoT and AI research efforts in Taiwan and the startup accelerator it runs there.

“Our new investment in Taiwan reflects our faith in its strong heritage of hardware and software integration,” said Jean-Philippe Courtois, Executive Vice President and President, Microsoft Global Sales, Marketing and Operations. “With Taiwan’s expertise in hardware manufacturing and the new datacenter region, we look forward to greater transformation, advancing what is possible with 5G, AI and IoT capabilities spanning the intelligent cloud and intelligent edge.”


The new region will offer access to the core Microsoft Azure services, as well as support for Microsoft 365, Dynamics 365 and Power Platform. That’s pretty much Microsoft’s playbook for launching all of its new regions these days. Like virtually all of Microsoft’s new data center regions, this one will also offer multiple availability zones.

Drew Houston will talk about building a startup and digital transformation during COVID at TechCrunch Disrupt

Dropbox CEO Drew Houston will be joining us for a one-on-one interview at this year’s TechCrunch Disrupt, happening next week, September 14-18.

Houston has been there and done that as a startup founder. After attending Y Combinator in 2007 and launching at the TechCrunch 50 (the precursor to TechCrunch Disrupt) in 2008, he went on to raise $1.7 billion from firms like BlackRock, Sequoia and Index Ventures before taking his company public in 2018.

Houston and his co-founder Arash Ferdowsi had a simple idea to make it easier to access your stuff on the internet. Instead of carrying your files on a thumb drive or emailing them to yourself, as was the norm at that time, you could have a hard drive in the cloud. This meant that you could log on wherever you were, even when you were not on your own computer, and access your files.

Houston and Ferdowsi wanted to make it dead simple to do this, and in the days before smartphones and tablets, they achieved that goal and grew a company that reported revenue of $467.4 million — or a run rate of over $1.8 billion — in its most recent earnings report. Today, Dropbox has a market cap of over $8 billion.

And as we find ourselves in the midst of a pandemic, businesses like Houston’s are suddenly hotter than ever, as companies accelerate their move to the cloud and employees working from home need access to work files and the ability to share them easily and securely with colleagues.

In the years since it launched, Dropbox has expanded beyond pure consumer file sharing, adding business tools for sharing files with teams and for administering and securing them from a central console, along with a password manager, an online vault for important files, full backup, and electronic signature and workflow via the purchase of HelloSign last year.

Houston will join us at TechCrunch Disrupt 2020 to discuss all of this, including how he helped build the company from that initial idea to where it is today, and what it takes to achieve the kind of success that every startup founder dreams about. Get your Digital Pro Pass, your Startup Alley Exhibitor Package, or even a Digital Pass for $45 to hear this session on the Disrupt stage. We hope you’ll join us.

Google One now offers free phone backups up to 15GB on Android and iOS

Google One, Google’s subscription program for buying additional storage and live support, is getting an update today that will bring free phone backups for Android and iOS devices to anybody who installs the app — even if they don’t have a paid membership. The catch: While the feature is free, the backups count against your free Google storage allowance of 15GB. If you need more, you’ll either have to delete data you no longer need or — you guessed it — buy more storage with a Google One membership. Paid memberships start at $1.99/month for 100GB.


Last year, paid members got access to this feature on Android, which stores your texts, contacts, apps, photos and videos in Google’s cloud. The “free” backups are now available to all Android users, while iOS users will get access once the Google One app rolls out on iOS in the near future.


With this update, Google is also introducing a new storage manager tool in Google One, which is available in the app and on the web, and which allows you to delete files and backups as needed. The tool works across Google properties and lets you find emails with very large attachments or large files in your Google Drive storage, for example.

With this free backup feature, Google is clearly trying to get more people onto Google One. The free 15GB storage limit is pretty easy to hit, after all (and that’s for your overall storage on Google, including Gmail and other services), and paying $1.99 for 100GB isn’t exactly a major expense, especially if you are already part of the Google ecosystem and use apps like Google Photos.

Wasabi announces $30M in debt financing as cloud storage business continues to grow

We may be in the thick of a pandemic with all of the economic fallout that comes from that, but certain aspects of technology don’t change no matter the external factors. Storage is one of them. In fact, we are generating more digital stuff than ever, and Wasabi, a Boston-based startup that has figured out a way to drive down the cost of cloud storage, is benefiting from that.

Today it announced a $30 million debt financing round led by Forestay Capital, the technology innovation arm of Waypoint Capital, with help from previous investors. As with the previous round, Wasabi is going with family office investors rather than traditional venture capital firms. Today’s round brings the total raised to $110 million, according to the company.

Founder and CEO David Friend says the company needs the funds to keep up with its rapid growth. “We’ve got about 15,000 customers today, hundreds of petabytes of storage, 2,500 channel partners, 250 technology partners — so we’ve been busy,” he said.

He says that revenue continues to grow in spite of the impact of COVID-19 on other parts of the economy. “Revenue grew 5x last year. It’ll probably grow 3.5x this year. We haven’t seen any real slowdown from the Coronavirus. Quarter over quarter growth will be in excess of 40% — this quarter over Q1 — so it’s just continuing on a torrid pace,” he said.

He said the money will be used mostly to continue expanding the company’s growing infrastructure. The more data it stores, the more data centers it needs, and that takes money. He is going the debt route because his product is backed by a tangible asset: the infrastructure used to store all the data in the Wasabi system. And debt financing turns out to be a lot cheaper to pay back than equity.

“Our biggest need is to build more infrastructure, because we are constantly buying equipment. We have to pay for it even before it fills up with customer data, so we’re raising another debt round now,” Friend said. He added, “Part of what we’re doing is just strengthening our balance sheet to give us access to more inexpensive debt to finance the building of the infrastructure.”

The challenge for a company like Wasabi, which is looking to capture a large chunk of the growing cloud storage market, is the infrastructure piece. It needs to keep building more to meet increasing demand, while keeping costs down, which remains its primary value proposition with customers.

The money will also help the company expand into new markets, as many countries have data sovereignty laws that require data to be stored in-country. That requires more money, and that’s the thinking behind this round.

The company launched in 2015. It previously raised $68 million in 2018.

Rallyhood exposed a decade of users’ private data

Rallyhood says it’s “private and secure.” But for some time, it wasn’t.

The social network designed to help groups communicate and coordinate left one of its cloud storage buckets containing user data open and exposed. The bucket, hosted on Amazon Web Services (AWS), was not protected with a password, allowing anyone who knew the easily guessable web address access to a decade’s worth of user files.
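For readers who run their own buckets, here is roughly how this kind of exposure is verified and then closed off with boto3. The bucket name is hypothetical:

```python
# How an exposure like this is typically verified and fixed, sketched
# with boto3 (pip install boto3). The bucket name is hypothetical.
import boto3
from botocore import UNSIGNED
from botocore.config import Config
from botocore.exceptions import ClientError

BUCKET = "example-user-uploads"  # hypothetical name

# 1) Verify: an anonymous (unsigned) client should NOT be able to list
#    a private bucket. If this call succeeds, the bucket is open.
anonymous = boto3.client("s3", config=Config(signature_version=UNSIGNED))
try:
    listing = anonymous.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
    print("EXPOSED:", [obj["Key"] for obj in listing.get("Contents", [])])
except ClientError:
    print("Bucket refused anonymous access, as it should.")

# 2) Remediate (run with the owner's credentials): turn on S3's
#    Block Public Access settings for the bucket.
owner = boto3.client("s3")
owner.put_public_access_block(
    Bucket=BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```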

Rallyhood boasts users from Girl Scout and Boy Scout troops, and Komen, Habitat for Humanity and YMCA chapters. The company also hosts thousands of smaller groups, like local bands, sports teams, art clubs and organizing committees. Many flocked to the site after Rallyhood said it would help migrate users from Yahoo Groups, after Verizon (which also owns TechCrunch) said it would shut down the discussion forum site last year.

The bucket contained group data dating as far back as 2011, up to and including last month. In total, it contained 4.1 terabytes of uploaded files, representing millions of users’ files.

Some of the files we reviewed contained sensitive data, like shared password lists and contracts, as well as permission slips and agreements. The documents also included non-disclosure agreements and other files that were not intended to be public.

Where we could identify contact information of users whose information was exposed, TechCrunch reached out to verify the authenticity of the data.

A security researcher who goes by the handle Timeless found the exposed bucket and informed TechCrunch, so that the bucket and its files could be secured.

When reached, Rallyhood chief technology officer Chris Alderson initially claimed that the bucket was for “testing” and that all user data was stored “in a highly secured bucket,” but later admitted that during a migration project, “there was a brief period when permissions were mistakenly left open.”

It’s not known if Rallyhood plans to warn its users and customers of the security lapse. At the time of writing, Rallyhood has made no statement about the incident on its website or any of its social media profiles.

How Spotify ran the largest Google Dataflow job ever for Wrapped 2019

In early December, Spotify launched its annual personalized Wrapped playlist with its users’ most-streamed sounds of 2019. That has become a bit of a tradition and isn’t necessarily anything new, but for 2019, it also gave users a look back at how they used Spotify over the last decade. Because this was quite a large job, Spotify gave us a bit of a look under the covers of how it generated these lists for its ever-growing number of free and paid subscribers.

It’s no secret that Spotify is a big Google Cloud Platform user. Back in 2016, the music streaming service publicly said that it was going to move to Google Cloud, after all, and in 2018, it disclosed that it would spend at least $450 million on its Google Cloud infrastructure in the following three years.

It was also back in 2018, for that year’s Wrapped, that Spotify ran the largest Google Cloud Dataflow job ever run on the platform, a service the company started experimenting with a few years earlier. “Back in 2015, we built and open-sourced a big data processing Scala API for Apache Beam and Google Cloud Dataflow called Scio,” Spotify’s VP of Engineering Tyson Singer told me. “We chose Dataflow over Dataproc because it scales with less operational overhead and Dataflow fit with our expected needs for streaming processing. Now we have a great open-source toolset designed and optimized for Dataflow, which in addition to being used by most internal teams, is also used outside of Spotify.”
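Scio itself is Scala, but Apache Beam’s Python SDK expresses the same programming model. As a toy illustration of the kind of per-user aggregation behind a Wrapped-style list (the data and field names here are made up):

```python
# A toy Apache Beam pipeline (pip install apache-beam) in Beam's Python
# SDK; Spotify's Scio is the Scala equivalent of this model. The data
# and field names are made up, and a real job would read from Cloud
# Storage or BigQuery and run on Dataflow instead of the local runner.
import apache_beam as beam

plays = [
    {"user": "u1", "track": "song_a"},
    {"user": "u1", "track": "song_a"},
    {"user": "u1", "track": "song_b"},
    {"user": "u2", "track": "song_a"},
]

with beam.Pipeline() as pipeline:  # local runner; Dataflow via options
    (
        pipeline
        | "Create" >> beam.Create(plays)
        | "KeyByUserTrack" >> beam.Map(lambda p: ((p["user"], p["track"]), 1))
        | "CountPlays" >> beam.CombinePerKey(sum)  # this step implies a shuffle
        | "Print" >> beam.Map(print)
    )
```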

For Wrapped 2019, which includes the annual and decadal lists, Spotify ran a job that was five times larger than in 2018 — but it did so at three-quarters of the cost. Singer attributes this to his team’s familiarity with the platform. “With this type of global scale, complexity is a natural consequence. By working closely with Google Cloud’s engineering teams and specialists and drawing learnings from previous years, we were able to run one of the most sophisticated Dataflow jobs ever written.”

Still, even with this expertise, the team couldn’t just iterate on the full data set as it figured out how to best analyze the data and use it to tell the most interesting stories to its users. “Our jobs to process this would be large and complex; we needed to decouple the complexity and processing in order to not overwhelm Google Cloud Dataflow,” Singer said. “This meant that we had to get more creative when it came to going from idea, to data analysis, to producing unique stories per user, and we would have to scale this in time and at or below cost. If we weren’t careful, we risked being wasteful with resources and slowing down downstream teams.”

To handle this workload, Spotify not only split its internal teams into three groups (data processing, client-facing and design, and backend systems), but also split the data processing jobs into smaller pieces. That marked a very different approach for the team. “Last year Spotify had one huge job that used a specific feature within Dataflow called ‘Shuffle.’ The idea here was that having a lot of data, we needed to sort through it, in order to understand who did what. While this is quite powerful, it can be costly if you have large amounts of data.”

This year, the company’s engineers minimized the use of Shuffle by using Google Cloud’s Bigtable as an intermediate storage layer. “Bigtable was used as a remediation tool between Dataflow jobs in order for them to process and store more data in a parallel way, rather than the need to always regroup the data,” said Singer. “By breaking down our Dataflow jobs into smaller components — and reusing core functionality — we were able to speed up our jobs and make them more resilient.”
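Spotify hasn’t published its job code, but the idea of intermediate storage in place of one giant shuffle can be sketched with the google-cloud-bigtable client: one job persists per-user aggregates keyed by user ID, and a later job reads back only the rows it needs. The project, instance, table and column names here are hypothetical.

```python
# Sketch of using Bigtable as an intermediate layer between jobs, with
# the google-cloud-bigtable client (pip install google-cloud-bigtable).
# Project/instance/table names are hypothetical, and Spotify's actual
# jobs write from Dataflow rather than a plain script like this.
from google.cloud import bigtable

client = bigtable.Client(project="my-project", admin=False)
table = client.instance("wrapped-intermediate").table("user_aggregates")

# Job A writes each user's aggregate once, keyed by user ID, instead
# of carrying all raw events into one enormous shuffle.
row = table.direct_row(b"user#u1")
row.set_cell("stats", "play_count", b"1234")
row.commit()

# Job B later reads back just the rows it needs and transforms them
# further, in parallel with other jobs doing the same.
stored = table.read_row(b"user#u1")
print(stored.cells["stats"][b"play_count"][0].value)  # b'1234'
```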

Singer attributes at least a part of the cost savings to this technique of using Bigtable, but he also noted that the team decomposed the problem into data collection, aggregation and data transformation jobs, which it then split into multiple separate jobs. “This way, we were not only able to process more data in parallel, but be more selective about which jobs to rerun, keeping our costs down.”

Many of the techniques the engineers on Singer’s teams developed are currently in use across Spotify. “The great thing about how Wrapped works is that we are able to build out more tools to understand a user, while building a great product for them,” he said. “Our specialized techniques and expertise of Scio, Dataflow and big data processing, in general, is widely used to power Spotify’s portfolio of products.”