Flawed data is putting people with disabilities at risk

Data isn’t abstract — it has a direct impact on people’s lives.

In 2019, an AI-powered delivery robot momentarily blocked a wheelchair user from safely accessing the curb when crossing a busy road. Speaking about the incident, the person noted how “it’s important that the development of technologies [doesn’t put] disabled people on the line as collateral”.

Alongside other minority groups, people with disabilities have long been harmed by flawed data and data tools. Disabilities are diverse, nuanced, and dynamic; they don’t fit within the formulaic structure of AI, which is programmed to find patterns and form groups. Because AI treats any outlier data as ‘noise’ and disregards it, too often people with disabilities are excluded from its conclusions.

Take for example the case of Elaine Herzberg, who was struck and killed by a self-driving Uber SUV in 2018. At the time of the collision, Herzberg was pushing a bicycle, which meant Uber’s system struggled to categorize her and flitted between labeling her as a ‘vehicle,’ ‘bicycle,’ and ‘other.’ The tragedy raised many questions for people with disabilities: would a person in a wheelchair or a scooter be at risk of the same fatal misclassification?

We need a new way of collecting and processing data. ‘Data’ covers personal information, user feedback, resumes, multimedia, user metrics, and much more, and it’s constantly being used to optimize our software. However, that optimization rarely accounts for the many nefarious ways data can be, and is, used in the wrong hands, or for what happens when principles are not applied at each touchpoint of building.

Our products are long overdue for a new, fairer data framework to ensure that data is managed with people with disabilities in mind. If it isn’t, people with disabilities will face more friction, and dangers, in a day-to-day life that is increasingly dependent on digital tools.

Misinformed data hampers the building of good tools

Products that lack accessibility might not stop people with disabilities from leaving their homes, but they can stop them from accessing pivot points of life like quality healthcare, education, and on-demand deliveries.

Our tools are a product of their environment. They reflect their creators’ world view and subjective lens. For too long, the same groups of people have been overseeing faulty data systems. It’s a closed loop, where underlying biases are perpetuated and groups that were already invisible remain unseen. But as data progresses, that loop becomes a snowball. We’re dealing with machine-learning models: if they’re taught long enough that ‘not being X’ (read: white, able-bodied, cisgender) means not being ‘normal’, they will evolve by building on that foundation.

Data is interlinked in ways that are invisible to us. It’s not enough to say that your algorithm won’t exclude people with registered disabilities. Biases are present in other sets of data. For example, in the United States it’s illegal to refuse someone a mortgage loan because they’re Black. But by basing the process heavily on credit scores — which have inherent biases detrimental to people of color — banks indirectly exclude that segment of society.

For people with disabilities, indirectly biased data could include frequency of physical activity or number of hours commuted per week. Here’s a concrete example of how indirect bias translates to software: if a hiring algorithm studies candidates’ facial movements during a video interview, a person with a cognitive disability or mobility impairment will face barriers that a fully able-bodied applicant will not.
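One rough way to look for this kind of indirect bias is to check how strongly each "neutral" field tracks a protected attribute. The sketch below is only an illustration: the field names, the toy applicant records, and the 0.5 threshold are all made up for the example, and a plain Pearson correlation will miss many real-world proxies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def flag_proxy_features(rows, protected_key, threshold=0.5):
    """Return fields whose correlation with the protected attribute meets the threshold."""
    protected = [float(row[protected_key]) for row in rows]
    flagged = {}
    for key in rows[0]:
        if key == protected_key:
            continue
        r = pearson([float(row[key]) for row in rows], protected)
        if abs(r) >= threshold:
            flagged[key] = r
    return flagged


# Hypothetical toy records, invented purely for illustration.
applicants = [
    {"has_disability": 1, "weekly_commute_hours": 1, "years_experience": 5},
    {"has_disability": 1, "weekly_commute_hours": 2, "years_experience": 8},
    {"has_disability": 0, "weekly_commute_hours": 9, "years_experience": 6},
    {"has_disability": 0, "weekly_commute_hours": 10, "years_experience": 7},
]

flagged = flag_proxy_features(applicants, "has_disability")
print(sorted(flagged))  # fields that deserve a closer look before being used in a model
```

In this toy data, commute hours track the disability flag almost perfectly while years of experience do not, so a model that leans on commute hours could exclude people with disabilities without ever seeing the flag itself.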

The problem also stems from people with disabilities not being viewed as part of businesses’ target market. When companies are in the early stage of brainstorming their ideal users, people’s disabilities often don’t figure, especially when they’re less noticeable, as with mental illness. That means the initial user data used to iterate products or services doesn’t come from these individuals. In fact, 56% of organizations still don’t routinely test their digital products among people with disabilities.

If tech companies proactively included individuals with disabilities on their teams, it’s far more likely that their target market would be more representative. In addition, all tech workers need to be aware of and factor in the visible and invisible exclusions in their data. It’s no simple task, and we need to collaborate on this. Ideally, we’ll have more frequent conversations, forums and knowledge-sharing on how to eliminate indirect bias from the data we use daily.

We need an ethical stress test for data

We test our products all the time — on usability, engagement, and even logo preferences. We know which colors perform better to convert paying customers, and the words that resonate most with people, so why aren’t we setting a bar for data ethics?

Ultimately, the responsibility of creating ethical tech does not just lie at the top. Those laying the brickwork for a product day after day are also liable. It was the Volkswagen engineer (not the company CEO) who was sent to jail for developing a device that enabled cars to evade US pollution rules.

Engineers, designers, product managers: we all have to acknowledge the data in front of us and think about why we collect it and how we collect it. That means dissecting the data we’re requesting and analyzing what our motivations are. Does it always make sense to ask about someone’s disabilities, sex or race? How does having this information benefit the end user?

At Stark, we’ve developed a five-point framework to run when designing and building any kind of software, service or tech. We have to address:

  1. What data we’re collecting
  2. Why we’re collecting it
  3. How it will be used (and how it can be misused)
  4. Simulate IFTTT: ‘if this, then that.’ Explain possible scenarios in which the data can be used nefariously, and alternate solutions. For instance, how can users be impacted by an at-scale data breach? What happens if this private information becomes public to their family and friends?
  5. Ship or trash the idea
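The five points above could be captured as a small review record that a team fills in for every field it wants to collect. This is a hypothetical sketch, not Stark’s actual tooling: the class names, the example answers, and the rule that every misuse scenario needs a written mitigation before shipping are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MisuseScenario:
    """One 'if this, then that' scenario and the planned response to it."""
    description: str
    mitigation: str = ""


@dataclass
class DataReview:
    """Answers to points 1-4 for a single piece of collected data."""
    what: str
    why: str
    how_used: str
    scenarios: List[MisuseScenario] = field(default_factory=list)

    def decision(self) -> str:
        """Point 5: ship only if every answer is filled in and every scenario is mitigated."""
        answered = all(text.strip() for text in (self.what, self.why, self.how_used))
        mitigated = bool(self.scenarios) and all(
            s.mitigation.strip() for s in self.scenarios
        )
        return "ship" if answered and mitigated else "trash"


# Hypothetical example answers.
screen_reader_flag = DataReview(
    what="Whether the user has a screen reader enabled",
    why="To default to the accessible layout",
    how_used="Stored on the device only, never sent to analytics",
    scenarios=[
        MisuseScenario(
            description="A data breach exposes which users rely on screen readers",
            mitigation="The flag never leaves the device, so there is nothing to breach",
        )
    ],
)
vague_request = DataReview(what="Disability status", why="", how_used="Various purposes")

print(screen_reader_flag.decision())  # ship
print(vague_request.decision())       # trash
```

The deliberately strict default, where an empty answer or an unmitigated scenario means "trash", mirrors the idea that data you cannot explain plainly is data you should not hold.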

If we can only explain our data using vague terminology and unclear expectations, or by stretching the truth, we shouldn’t be allowed to have that data. The framework forces us to break down data in the simplest terms, and if we can’t, it’s because we’re not yet equipped to handle it responsibly.

Innovation has to include people with disabilities

Complex data technology is entering new sectors all the time, from vaccine development to robotaxis. Any bias against individuals with disabilities in these sectors stops them from accessing the most cutting-edge products and services. As we become more dependent on tech in every niche of our lives, there’s greater room for exclusion in how we carry out everyday activities.

This is all about forward thinking and baking inclusion into your product at the start. Money and/or experience aren’t limiting factors here: changing your thought process and development journey is free. It’s just a conscious pivot in a better direction. And while the upfront cost may be a heavy lift, the profits you’d lose from not tapping into these markets, or because you end up retrofitting your product down the line, far outweigh that initial expense. This is especially true for enterprise-level companies that won’t be able to access academia or governmental contracts without being compliant.

So early-stage companies, integrate accessibility principles into your product development and gather user data to constantly reinforce those principles. Sharing data across your onboarding, sales, and design teams will give you a more complete picture of where your users are experiencing difficulties. Later-stage companies should carry out a self-assessment to determine where those principles are lacking in their product, and harness historical data and new user feedback to generate a fix.

An overhaul of AI and data isn’t just about adapting businesses’ framework. We still need the people at the helm to be more diverse. The fields remain overwhelmingly male and white, and in tech, there are numerous first-hand accounts of exclusion and bias towards people with disabilities. Until the teams curating data tools are themselves more diverse, nations’ growth will continue to be stifled, and people with disabilities will be some of the hardest-hit casualties.

Outdoor startups see supercharged growth during COVID-19 era

After years of sustained growth, the pandemic supercharged the outdoor recreation industry. Startups that provide services like camper vans, private campsites and trail-finding apps became relevant to millions of new users when COVID-19 shut down indoor recreation, building on an existing boom in outdoor recreation.

Startups like Outdoorsy, AllTrails, Cabana, Hipcamp, Kibbo and Lowergear Outdoors have seen significant growth, but to keep it going, consumers who discovered a fondness for the great outdoors during the pandemic must turn it into a lifelong interest.

Social media, increased environmentalism and high urbanization were already fueling a boom in popularity. There was a 72% increase in people who camp more than three times a year between 2014 and 2019, mostly spurred by young millennials, young families with kids and nonwhite participants.

But 2020 was a different animal: After months of shelter-in-place orders, widespread shutdowns and physical distancing, the outdoors became the only location for safe socializing. In South Dakota, the Lewis and Clark Recreation Area saw a 59% increase in visitors from 2019 to 2020. In the pandemic year, consumers spent $887 billion on outdoor recreation according to the Outdoor Industry Association, more than pharmaceuticals and fuel combined.

And it’s going to continue to grow. Hiking equipment alone is projected to reach a $7.4 billion market size by 2027, a 6.3% compound annual growth rate. Camping and caravanning is having an even more drastic moment. Without international travel, vacations shifted from flights to exotic resorts to domestic road trips, self-contained rentals and camping. In 2020, the market for camping and caravanning was almost $40 billion and is predicted to rise 13% to just over $45 billion this year.
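The camping figures can be sanity-checked with basic compound-growth arithmetic. The sketch below treats "almost $40 billion" as exactly 40, which is an assumption; the function names are illustrative. The hiking projection is not checked because the article does not give its base year.

```python
def grow(value, rate, years=1):
    """Compound `value` at `rate` (0.13 means 13%) for `years` periods."""
    return value * (1 + rate) ** years


def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1 / years) - 1


camping_2020 = 40.0  # billions of dollars; assumed round figure for "almost $40 billion"
camping_2021 = grow(camping_2020, 0.13)
print(f"{camping_2021:.1f}")  # 45.2, consistent with "just over $45 billion"
```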

After an initial, extreme drop-off in engagement early in the pandemic, as national parks closed, private camping sites shut down and domestic travel ceased, many outdoor startups have had a breakout year. Outdoorsy, the peer-to-peer camper van rental marketplace, said it saw 44% of all bookings in the company’s history in 2020.

Campsite booking platform Hipcamp said it sent three times as much money to landowners in 2020 as compared to 2019. And it’s not just experienced outdoor veterans taking advantage of the work-from-home lifestyle: in 2020, Cabana, a camper van rental startup, said 70% of its customers had never rented a camper van or an RV before and another 26% had only done it once.

But a report commissioned by the Outdoor Industry Association showed that the most popular outdoor activities were ones that people could do close to home, not the traveling kind Hipcamp, Cabana and Outdoorsy traffic in. The three most popular outdoor activities for newbies: walking, running and bicycling.

But the pandemic did create a small boost for camping, climbing, backpacking and kayaking, fueled by an increase in women and in younger, more ethnically diverse, urban and slightly less wealthy people pushing into the outdoors. This class of outdoor startups will need to engage these new demographics to capitalize on the pandemic’s outdoor boom because, according to the report, a quarter of those who started new outdoor activities during the pandemic don’t plan on continuing once it’s over.

Startups are increasing accessibility to the outdoors

But getting into the outdoors can be overwhelming: there’s gear to buy, skills to learn, unfamiliar areas to explore and the added stressor of safety. Outdoor startups are working to lower the barrier to entry to help grow their businesses.

“I think anytime you have like 2,000 articles with two dozen tips on how to use a product, that tells me that it is really, really too hard to use,” said Cabana founder Scott Kubly. “To me, that says there’s nothing but friction in this process. If you want to build something that’s mainstream, you need to make it super consistent and really easy to use.”

Kubly said only half a percent of the U.S. population takes a rental van or RV trip each year. Planning an outdoor adventure can be time-consuming — choosing a location, finding an open campsite, planning meals and water, and figuring out dump stations for trash or septic. That planning is multiplied tenfold if you are going for a road trip or backpacking and need to find new places every other night.

Business continuity planning is a necessity for your fund and portfolio

Just shy of a year ago, I sent an email to our global fund manager partners and to our direct portfolio CEOs titled “Only the decisive survive.” At that time, not many outside of China were concerned about COVID-19. However, I was obsessed.

Hearing stories from fund manager friends with operations in China, I knew things were worse than what the Chinese press were telling the world. And I live only five miles south of the location of the first COVID death in the U.S. The pandemic was accelerating exponentially, and I wanted to get all of our partners to open their eyes to the risks and prepare as well as they could.

I’m not writing with that level of intensity or urgency this time, but I am concerned. We all need to be taking precautionary measures, not just in light of COVID, but to ensure our firms can continue to thrive when faced with unexpected tragedy.

My partner Susana invested in 90 funds over 20 years, and she’s seen everything from motorcycle accidents to depression take out fund managers and CEOs. Life works that way sometimes, and it’s not always someone else. It’s the “What happens if I get hit by a bus?” scenario. In this case, the bus happens to be a global pandemic.

One of our funds in Asia recently reported COVID cases in three CEOs among their 23 companies. While developed market infections and deaths are trending down, many countries are seeing serious new outbreaks, and some, like Brazil, are doing badly.

Pandemic forecasting site IHME predicts a growing caseload across sub-Saharan Africa and East Asia and Pacific regions. The LAC region is trending down overall, but some countries, including Colombia, are expected to experience a second (or third) wave of infections.

As the Economist said in mid-February, “Coronavirus is not done with humanity yet.”

Planning for your fund

A month or so ago, we were trying to move forward with an investment in a fund in Africa with whom we had been speaking and doing due diligence for a few months. They went radio silent for over two weeks. We didn’t know whether to be miffed, concerned for their health, or what.

Data scientists: Bring the narrative to the forefront

By 2025, 463 exabytes of data will be created each day, according to some estimates. (For perspective, one exabyte of storage could hold 50,000 years of DVD-quality video.) It’s now easier than ever to translate physical and digital actions into data, and businesses of all types have raced to amass as much data as possible in order to gain a competitive edge.

However, in our collective infatuation with data (and obtaining more of it), what’s often overlooked is the role that storytelling plays in extracting real value from data.

The reality is that data by itself is insufficient to really influence human behavior. Whether the goal is to improve a business’ bottom line or convince people to stay home amid a pandemic, it’s the narrative that compels action, rather than the numbers alone. As more data is collected and analyzed, communication and storytelling will become even more integral in the data science discipline because of their role in separating the signal from the noise.

Data alone doesn’t spur innovation — rather, it’s data-driven storytelling that helps uncover hidden trends, powers personalization, and streamlines processes.

Yet this can be an area where data scientists struggle. In Anaconda’s 2020 State of Data Science survey of more than 2,300 data scientists, nearly a quarter of respondents said that their data science or machine learning (ML) teams lacked communication skills. This may be one reason why roughly 40% of respondents said they were able to effectively demonstrate business impact “only sometimes” or “almost never.”

The best data practitioners must be as skilled in storytelling as they are in coding and deploying models — and yes, this extends beyond creating visualizations to accompany reports. Here are some recommendations for how data scientists can situate their results within larger contextual narratives.

Make the abstract more tangible

Ever-growing datasets help machine learning models better understand the scope of a problem space, but more data does not necessarily help with human comprehension. Even for the most left-brain of thinkers, it’s not in our nature to understand large abstract numbers or things like marginal improvements in accuracy. This is why it’s important to include points of reference in your storytelling that make data tangible.

For example, throughout the pandemic, we’ve been bombarded with countless statistics around case counts, death rates, positivity rates, and more. While all of this data is important, tools like interactive maps and conversations around reproduction numbers are more effective than massive data dumps in terms of providing context, conveying risk, and, consequently, helping change behaviors as needed. In working with numbers, data practitioners have a responsibility to provide the necessary structure so that the data can be understood by the intended audience.

Enterprise security attackers are one password away from your worst day

If the definition of insanity is doing the same thing over and over and expecting a different outcome, then one might say the cybersecurity industry is insane.

Criminals continue to innovate with highly sophisticated attack methods, but many security organizations still use the same technological approaches they did 10 years ago. The world has changed, but cybersecurity hasn’t kept pace.

Distributed systems, with people and data everywhere, mean the perimeter has disappeared. And the hackers couldn’t be more excited. The same technology approaches, like correlation rules, manual processes and reviewing alerts in isolation, do little more than remedy symptoms while hardly addressing the underlying problem.

The current risks aren’t just technology problems; they’re also problems of people and processes.

Credentials are supposed to be the front gates of the castle, but because the security operations center (SOC) is failing to change, it is failing to detect when those gates are breached. The cybersecurity industry must rethink its strategy to analyze how credentials are used and stop breaches before they become bigger problems.

It’s all about the credentials

Compromised credentials have long been a primary attack vector, but the problem has only grown worse in the mid-pandemic world. The acceleration of remote work has increased the attack footprint as organizations struggle to secure their network while employees work from unsecured connections. In April 2020, the FBI said that cybersecurity attacks reported to the organization grew by 400% compared to before the pandemic. Just imagine where that number is now in early 2021.

It only takes one compromised account for an attacker to enter Active Directory and create their own credentials. In such an environment, all user accounts should be considered as potentially compromised.

Nearly all of the hundreds of breach reports I’ve read have involved compromised credentials. More than 80% of hacking breaches are now enabled by brute force or the use of lost or stolen credentials, according to the 2020 Data Breach Investigations Report. The most effective and commonly used strategy is the credential stuffing attack, in which adversaries try stolen username and password pairs against a login; once they break in, they exploit the environment, then move laterally to gain higher-level access.
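Analyzing how credentials are used can start with something as simple as counting how many different accounts a single source fails to log in to. The sketch below assumes a hypothetical log format of (source IP, username, success) tuples and an arbitrary threshold of three accounts; real detection would also need time windows, successful-login follow-up, and handling of shared or rotating addresses.

```python
from collections import defaultdict


def flag_stuffing_sources(events, min_accounts=3):
    """Return source IPs with failed logins against at least `min_accounts` distinct usernames."""
    failed_accounts = defaultdict(set)
    for source_ip, username, success in events:
        if not success:
            failed_accounts[source_ip].add(username)
    return {ip for ip, users in failed_accounts.items() if len(users) >= min_accounts}


# Hypothetical login events: (source_ip, username, success).
events = [
    ("203.0.113.7", "alice", False),
    ("203.0.113.7", "bob", False),
    ("203.0.113.7", "carol", False),
    ("203.0.113.7", "dave", True),    # one stolen pair finally works
    ("198.51.100.4", "erin", False),  # an ordinary user mistyping a password
    ("198.51.100.4", "erin", False),
    ("198.51.100.4", "erin", True),
]

suspicious = flag_stuffing_sources(events)
print(suspicious)  # {'203.0.113.7'}
```

Counting distinct usernames rather than raw failures is what separates this pattern from a single user who keeps mistyping a password.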

Building customer-first relationships in a privacy-first world is critical

In business today, many believe that consumer privacy and business results are mutually exclusive — to excel in one area is to lack in the other. Consumer privacy is seen by many in the technology industry as an area to be managed.

But the truth is, the companies who champion privacy will be better positioned to win in all areas. This is especially true as the digital industry continues to undergo tectonic shifts in privacy — both in government regulation and browser updates.

By the end of 2022, all major browsers will have phased out third-party cookies, the tracking codes placed on a visitor’s computer by a website other than your own. Additionally, mobile device makers are limiting identifiers allowed on their devices and applications. Across industry verticals, the global enterprise ecosystem now faces a critical moment in which digital advertising will be forever changed.

Up until now, consumers have enjoyed a mostly free internet experience, but as publishers adjust to a cookieless world, consumers could see more paywalls and less free content.

They may also see a decrease in the creation of new free apps, mobile gaming and other ad-supported content unless businesses find new ways to authenticate users and maintain a value exchange of free content for personalized advertising.

When consumers authenticate themselves to brands and sites, they create revenue streams for publishers as well as the opportunity to receive discounts, first-looks and other specially tailored experiences from brands.

To protect consumer data, companies need to architect internal systems around data custodianship versus acting from a sense of data entitlement. While this is a challenging and massive ongoing evolution, the benefits of starting now are enormous.

Putting privacy front and center creates a sustainable digital ecosystem that enables better advertising and drives business results. There are four steps to consider when building for tomorrow’s privacy-centric world:

Transparency is key

As we collectively look to redesign how companies interact with and think about consumers, we should first recognize that putting people first means putting transparency first. When people trust a brand’s or publisher’s intentions, they are more willing to share their data and identity.

This process, where consumers authenticate themselves — or actively share their phone number, email or other form of identity — in exchange for free content or another form of value, allows brands and publishers to get closer to them.

From pickup basketball to market domination: My wild ride with Coupang

A month ago, Coupang arrived on Wall Street with a bang. The South Korean e-commerce giant — buoyed by $12 billion in 2020 revenue — raised $4.55 billion in its IPO and hit a valuation as high as $109 billion. It is the biggest U.S. IPO of the year so far, and the largest from an Asian company since Alibaba’s.

But long before founder Bom Kim rang the bell, I knew him as a fellow founder on the hunt for a good idea. We stayed in touch as he formed his vision for what would become Coupang, and I built it alongside him as an investor and board member.

As a board member, I’ve observed a brief quiet period following the IPO. But now I want to share how exactly our paths intersected, largely because Bom exemplifies what founders should aspire to and should seek: big risks, dogged determination, and obsessive responsiveness to the market.

Bom fearlessly turned down an acquisition offer from then-market-leader Groupon, ferociously learned what he didn’t know, made a daring pivot even after becoming a billion-dollar company, and iteratively built a vision for end-to-end market dominance.

Why I like talking to founders early

In 2008, I met Bom while playing a weekend game of pickup basketball at Stuyvesant High School. We realized we had a mutual acquaintance through my recently-sold startup, Community Connect Inc. He told me about the magazine he had sold and his search for a next move. So we agreed to meet up for lunch and go over some of his ideas.

To be honest, I don’t remember any of those early ideas, probably because they weren’t very good. But I really liked Bom. Even as I was crapping on his ideas, I could tell he was sharp from how he processed my feedback. It was obvious he was super smart and definitely worth keeping in touch with, which we continued to do even after he relocated to go to HBS.

I soon began investing in and incubating businesses, starting mostly with my own capital. When I got a call from an executive recruiter working for a company in Chicago called Groupon — who told me they were at a $50 million run rate in only a few months — I became fascinated with their model and started talking to some of the investors, former employees, and merchants.

Inspired, and as a new parent, I decided to launch a similar daily-deal business for families: Instead of skydiving and go-kart racing, we offered deals on kids’ music classes and birthday party venues. While I was working on this idea, John Ason, an angel investor in Diapers.com, said I should meet with the founder and CEO Marc Lore. By the end of the meeting, Marc and I etched a partnership to launch DoodleDeals.com co-branded with Diapers.com. The first deal did over $70,000 — great start.

All that time, I kept in touch with Bom. In February 2010, we were catching up over lunch at the Union Square Ippudo, and he asked if I had heard of Buywithme, a Boston-based Groupon clone. He hadn’t yet heard about Groupon, so I explained the business model and shared the numbers. He thought something similar might transfer well to South Korea, where he was born and his parents still lived.

This kind of conversation is exactly why I love working with founders early, even before the idea forms: You learn a lot about them as they explore, wrestle with uncertainty, and eventually build conviction on a business they plan to spend the next decade-plus building. Ultimately, success comes down to founders’ belief in themselves; when you develop the same belief in them as an investor, it is pretty magical. I was starting to really believe in Bom.

The idea gets real — and moves fast

I'm not Korean — I am ethnically Chinese — so Bom put together slides on the Korean market and why it was perfect for the daily-deal model. In short: a very dense population that’s incredibly online.

I told Bom he should drop out of business school and do this. He said, “You don’t think I can wait until I graduate?” I responded, “No way! It will be over by then!”

First-mover advantage is real in a business like this, and it didn’t take Bom long to see that. He raised a small $1.3 million seed round. I invested, joined the board. Because of my knowledge of the deals market and my entrepreneurial experience, Bom asked me to get hands-on in Korea — not at all typical for an investor or even a board member, but I think of myself as a builder and not just a backer, and this is how I wanted to operate as an investor.

Once he realized time was of the essence, Bom was heads down. For context, he was engaged to his longtime girlfriend, Nancy, who also went to Harvard undergrad and was a successful lawyer. Imagine telling your fiancée, “Honey, I am dropping out of business school, moving to Korea to start a company. I will be back for the wedding. Not sure if I will ever be coming back to the U.S.”

I emailed Bom, saying: “Bom — honestly as a friend. Enjoy your wedding. It is a real blessing that your fiancée is being so supportive of you doing this. Launching a site a few weeks before the wedding is going to be way too distracting and she won’t feel like your heart is in it. Launching a few weeks later is not going to make or break this business. Trust me.”

Bom didn’t listen. He launched Coupang in August 2010, two weeks before the wedding. He flew back to Boston, got married, and — running on basically no sleep — sneaked out for a 20-minute nap in the middle of his reception. Right after the wedding, he flew back to Seoul. Nancy has to be one of the most supportive and understanding partners I have ever seen. They are now married and have two kids.

Jumping on new distribution, turning down an acquisition offer