Among the many, many tasks required of grade school teachers is that of gauging each student’s reading level, usually by a time-consuming and high-pressure one-on-one examination. Microsoft’s new Reading Progress application takes some of the load off the teacher’s shoulders, allowing kids to do their reading at home and using natural language understanding to help highlight obstacles and progress.
The last year threw most educational plans into disarray, and reading levels did not advance the way they would have if kids were in school. Companies like Amira are emerging to fill the gap with AI-monitored reading, and Microsoft aims to provide teachers with more tools on their side.
Reading Progress is an add-on for Microsoft Teams that helps teachers administer reading tests in a more flexible way, taking pressure off students who might stumble in a command performance, and identifying and tracking important reading events like skipped words and self-corrections.
Teachers pick reading assignments for each student (or the whole class) to read, and the kids do so on their own time, more like doing homework than taking a test. They record a video directly in the app, the audio of which is analyzed by algorithms watching for the usual stumbles.
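The miscue-flagging step can be sketched as a word-level alignment: compare the assigned passage against a speech-to-text transcript of the recording, and treat deletions as skipped words and insertions as repeats or self-corrections. This is a hypothetical illustration (the function name and approach are mine, not Microsoft's; the real system presumably works on richer acoustic features than a plain transcript):

```python
import difflib

def flag_miscues(passage, transcript):
    """Align the expected passage against a transcript of the child's
    reading and flag likely miscues. A simplified sketch, not
    Microsoft's actual pipeline."""
    expected = passage.lower().split()
    read = transcript.lower().split()
    miscues = []
    matcher = difflib.SequenceMatcher(a=expected, b=read)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "delete":       # passage words never read: skipped
            miscues.append(("skipped", expected[i1:i2]))
        elif op == "insert":     # extra words: repeats / self-corrections
            miscues.append(("inserted", read[j1:j2]))
        elif op == "replace":    # word read differently: substitution
            miscues.append(("substituted", expected[i1:i2], read[j1:j2]))
    return miscues
```

Feeding it a passage and a transcript where the reader skipped "fox" and repeated "the" would yield a `("skipped", ...)` and an `("inserted", ...)` entry, which is roughly the event stream a teacher's dashboard would aggregate.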
As you can see in this video testimony by 4th grader Brielle, this may be preferable to many kids:
If a bright and confident kid like Brielle feels better doing it this way (and is now reading two years ahead of her grade, nice work Brielle!), what about the kids who are having trouble reading due to dyslexia, or are worried about their accent, or are simply shy? Being able to just talk to their own camera, by themselves in their own home, could make for a much better reading — and therefore a more accurate assessment.
It’s not meant to replace the teacher altogether, of course — it’s a tool that allows overloaded educators to prioritize and focus better and track things more objectively. It’s similar to how Amira is not meant to replace in-person reading groups — impossible during the pandemic — but provides a similarly helpful process of quickly correcting common mistakes and encouraging the reader.
Of the many frustrations of having a severe motor impairment, the difficulty of communicating must surely be among the worst. The tech world has not offered much succor to those affected by things like locked-in syndrome, ALS, and severe strokes, but startup Cognixion aims to with a novel form of brain monitoring that, combined with a modern interface, could make speaking and interaction far simpler and faster.
The company’s One headset tracks brain activity closely in such a way that the wearer can direct a cursor — reflected on a visor like a heads-up display — in multiple directions or select from various menus and options. No physical movement is needed, and with the help of modern voice interfaces like Alexa, the user can not only communicate efficiently but freely access all kinds of information and content most people take for granted.
But it’s not a miracle machine, and it isn’t a silver bullet. Here’s how it got started.
Overhauling decades-old brain tech
Everyone with a motor impairment has different needs and capabilities, and there are a variety of assistive technologies that cater to many of these needs. But many of these techs and interfaces are years or decades old — medical equipment that hasn’t been updated for an era of smartphones and high-speed mobile connections.
Some of the most dated interfaces, unfortunately, are those used by people with the most serious limitations: those whose movements are limited to their heads, faces, eyes — or even a single eyelid, like Jean-Dominique Bauby, the famous author of “The Diving Bell and the Butterfly.”
One of the tools in the toolbox is the electroencephalogram, or EEG, which involves detecting activity in the brain via patches on the scalp that record electrical signals. But while they’re useful in medicine and research in many ways, EEGs are noisy and imprecise — more for finding which areas of the brain are active than, say, which sub-region of the sensory cortex or the like. And of course you have to wear a shower cap wired with electrodes (often greasy with conductive gel) — it’s not the kind of thing anyone wants to do for more than an hour, let alone all day every day.
Yet even among those with the most profound physical disabilities, cognition is often unimpaired — as indeed EEG studies have helped demonstrate. It made Andreas Forsland, co-founder and CEO of Cognixion, curious about further possibilities for the venerable technology: “Could a brain-computer interface using EEG be a viable communication system?”
He first used EEG for assistive purposes in a research study some five years ago. The team was looking into alternative methods of letting a person control an on-screen cursor, among them an accelerometer for detecting head movements, and tried integrating EEG readings as another signal. But it was far from a breakthrough.
A modern lab with an EEG cap wired to a receiver and laptop – this is an example of how EEG is commonly used.
He ran down the difficulties: “With a read-only system, the way EEG is used today is no good; other headsets have slow sample rates and they’re not accurate enough for a real-time interface. The best BCIs are in a lab, connected to wet electrodes — it’s messy, it’s really a non-starter. So how do we replicate that with dry, passive electrodes? We’re trying to solve some very hard engineering problems here.”
The limitations, Forsland and his colleagues found, were not so much with the EEG itself as with the way it was carried out. This type of brain monitoring is meant for diagnosis and study, not real-time feedback. It would be like taking a tractor to a drag race. Not only do EEGs often work with a slow, thorough check of multiple regions of the brain that may last several seconds, but the signal they produce is analyzed by dated statistical methods. So Cognixion started by questioning both practices.
Improving the speed of the scan is more complicated than overclocking the sensors or something. Activity in the brain must be inferred by collecting a certain amount of data. But that data is collected passively, so Forsland tried bringing an active element into it: a rhythmic electric stimulation that is in a way reflected by the brain region, but changed slightly depending on its state — almost like echolocation.
The Cognixion One headset with its dry EEG terminals visible.
They detect these signals with a custom set of six EEG channels in the visual cortex area (up and around the back of your head), and use a machine learning model to interpret the incoming data. Running a convolutional neural network locally on an iPhone — something that wasn’t really possible a couple years ago — the system can not only tease out a signal in short order but make accurate predictions, making for faster and smoother interactions.
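The "echolocation" idea described above resembles what BCI researchers call frequency tagging: each on-screen option is driven at its own stimulation frequency, and the decoder asks which frequency's signature dominates a short window of EEG. Here is a minimal single-channel sketch of that detection step, using plain sine/cosine projections (an assumption on my part for illustration; Cognixion's actual decoder is a learned convolutional network, not this toy):

```python
import math

def detect_target(window, sample_rate, candidate_freqs):
    """Pick which stimulation frequency dominates an EEG window by
    projecting the signal onto sine/cosine references at each
    candidate frequency. Sketch of the frequency-tagging idea only."""
    n = len(window)
    best_freq, best_power = None, -1.0
    for f in candidate_freqs:
        s = sum(x * math.sin(2 * math.pi * f * t / sample_rate)
                for t, x in enumerate(window))
        c = sum(x * math.cos(2 * math.pi * f * t / sample_rate)
                for t, x in enumerate(window))
        power = (s * s + c * c) / n  # energy at frequency f
        if power > best_power:
            best_freq, best_power = f, power
    return best_freq
```

A real decoder works across multiple channels and uses learned filters to cope with noise, but the decision it makes is the same: map the strongest evoked frequency back to the menu option flickering at that rate.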
The result is sub-second latency with 95-100 percent accuracy in a wireless headset powered by a mobile phone. “The speed, accuracy and reliability are getting to commercial levels — we can match the best in class of the current paradigm of EEGs,” said Forsland.
Dr. William Goldie, a clinical neurologist who has used and studied EEGs and other brain monitoring techniques for decades (and who has been voluntarily helping Cognixion develop and test the headset), offered a positive evaluation of the technology.
“There’s absolutely evidence that brainwave activity responds to thinking patterns in predictable ways,” he noted. This type of stimulation and response was studied years ago. “It was fascinating, but back then it was sort of in the mystery magic world. Now it’s resurfacing with these special techniques and the computerization we have these days. To me it’s an area that’s opening up in a manner that I think clinically could be dramatically effective.”
BCI, meet UI
The first thing Forsland told me was “We’re a UI company.” And indeed even such a step forward in neural interfaces as he later described means little if it can’t be applied to the problem at hand: helping people with severe motor impairment to express themselves quickly and easily.
Sad to say, it’s not hard to imagine improving on the “competition,” things like puff-and-blow tubes and switches that let users laboriously move a cursor right, right a little more, up, up a little more, then click: a letter! Gaze detection is of course a big improvement over this, but it’s not always an option (eyes don’t always work as well as one would like) and the best eye-tracking solutions (like a Tobii Dynavox tablet) aren’t portable.
Why shouldn’t these interfaces be as modern and fluid as any other? The team set about making a UI with this and the capabilities of their next-generation EEG in mind.
Image Credits: Cognixion
Their solution takes bits from the old paradigm and combines them with modern virtual assistants and a radial design that prioritizes quick responses and common needs. It all runs in an app on an iPhone, the display of which is reflected in a visor, acting as a HUD and outward-facing display.
In easy reach of, not to say a single thought but at least a moment’s concentration or a tilt of the head, are everyday questions and responses — yes, no, thank you, etc. Then there are slots to put prepared speech into — names, menu orders, and so on. And then there’s a keyboard with word- and sentence-level prediction that allows common words to be popped in without spelling them out.
“We’ve tested the system with people who rely on switches, who might take 30 minutes to make 2 selections. We put the headset on a person with cerebral palsy, and she typed out her name and hit play in 2 minutes,” Forsland said. “It was ridiculous, everyone was crying.”
Goldie noted that there’s something of a learning curve. “When I put it on, I found that it would recognize patterns and follow through on them, but it also sort of taught patterns to me. You’re training the system, and it’s training you — it’s a feedback loop.”
“I can be the loudest person in the room”
One person who has found it extremely useful is Chris Benedict, a DJ, public speaker, and disability advocate who himself has Dyskinetic Cerebral Palsy. It limits his movements and ability to speak, but doesn’t stop him from spinning (digital) records at various engagements, or from explaining his experience with Cognixion’s One headset over email. (And you can see him demonstrating it in person in the video above.)
Image Credits: Cognixion
“Even though it’s not a tool that I’d need all the time it’s definitely helpful in aiding my communication,” he told me. “Especially when I need to respond quickly or am somewhere that is noisy, which happens often when you are a DJ. If I wear it with a Bluetooth speaker I can be the loudest person in the room.” (He always has a speaker on hand, since “you never know when you might need some music.”)
The benefits offered by the headset give some idea of what is lacking from existing assistive technology (and what many people take for granted).
“I can use it to communicate, but at the same time I can make eye contact with the person I’m talking to, because of the visor. I don’t have to stare at a screen between me and someone else. This really helps me connect with people,” Benedict explained.
“Because it’s a headset I don’t have to worry about getting in and out of places, there is no extra bulk added to my chair that I have to worry about getting damaged in a doorway. The headset is balanced too, so it doesn’t make my head lean back or forward or weigh my neck down,” he continued. “When I set it up to use the first time it had me calibrate, and it measured my personal range of motion so the keyboard and choices fit on the screen specifically for me. It can also be recalibrated at any time, which is important because not every day is my range of motion the same.”
Alexa, which has been extremely helpful to people with a variety of disabilities due to its low cost and wide range of compatible devices, is also part of the Cognixion interface, something Benedict appreciates, having himself adopted the system for smart home and other purposes. “With other systems this isn’t something you can do, or if it is an option, it’s really complicated,” he said.
As Benedict demonstrates, there are people for whom a device like Cognixion’s makes a lot of sense, and the hope is it will be embraced as part of the necessarily diverse ecosystem of assistive technology.
Forsland said that the company is working closely with the community, from users to clinical advisors like Goldie and other specialists, like speech therapists, to make the One headset as good as it can be. But the hurdle, as with so many devices in this class, is how to actually put it on people’s heads — financially and logistically speaking.
Cognixion is applying for FDA clearance to get the cost of the headset — which, being powered by a phone, is not as high as it would be with an integrated screen and processor — covered by insurance. But in the meantime the company is working with clinical and corporate labs that are doing neurological and psychological research. Places where you might find an ordinary, cumbersome EEG setup, in other words.
The company has raised funding and is looking for more (hardware development and medical pursuits don’t come cheap), and has also collected a number of grants.
The One headset may still be some years away from wider use (the FDA is never in a hurry), but that allows the company time to refine the device and include new advances. Unlike many other assistive devices, for example a switch or joystick, this one is largely software-limited, meaning better algorithms and UI work will significantly improve it. While many wait for companies like Neuralink to create a brain-computer interface for the modern era, Cognixion has already done so for a group of people who have much more to gain from it.
You can learn more about the Cognixion One headset and sign up to receive the latest at its site here.
Sony and Discord have announced a partnership that will integrate the latter’s popular gaming-focused chat app with PlayStation’s own built-in social tools. It’s a big move and a fairly surprising one given how recently acquisition talks were in the air — Sony appears to have offered a better deal than Microsoft, taking an undisclosed minority stake in the company ahead of a rumored IPO.
The exact nature of the partnership is not expressed in the brief announcement post. The closest we come to hearing what will actually happen is that the two companies plan to “bring the Discord and PlayStation experiences closer together on console and mobile starting early next year,” which at least is easy enough to imagine.
Discord has partnered with console platforms before, though its deal with Microsoft was not a particularly deep integration. This is almost certainly less a “friends can see what you’re playing on PS5” feature and more an alternative chat infrastructure for anyone on a Sony system. Chances are it’ll be a deep, system-wide but clearly Discord-branded option — such as a “Start a voice chat with Discord” prompt when you invite a friend to your game or join theirs.
The timeline of early 2022 also suggests that this is a major product change, probably coinciding with a big platform update on Sony’s long-term PS5 roadmap.
While the new PlayStation is better than the old one when it comes to voice chat, the old one wasn’t great to begin with, and Discord is not just easier to use but something millions of gamers already do use daily. And these days, if a game isn’t an exclusive, being robustly cross-platform is the next best option — so PS5 players being able to seamlessly join and chat with PC players will reduce a pain point there.
Of course Microsoft has its own advantages, running both the Xbox and Windows ecosystems, but it has repeatedly fumbled this opportunity and the acquisition of Discord might have been the missing piece that tied it all together. That bird has flown, of course, and while Microsoft’s acquisition talks reportedly valued Discord at some $10 billion, it seems the growing chat app decided it would rather fly free with an IPO and attempt to become the dominant voice platform everywhere rather than become a prized pet.
Sony has done its part, financially speaking, by taking part in Discord’s recent $100 million Series H round. The amount it contributed is unknown, but perforce it can’t be more than a small minority stake given how much funding the company has taken on and its total valuation.
Returnal, released today for the PlayStation 5, is an action adventure that has you exploring an alien world that reconfigures itself whenever you die, bringing you back for another shot at escaping. It’s exciting, frustrating, and beautiful, though it isn’t particularly original. But while it is arguably the first game to be released that was designed and built for next generation consoles, it’s not the mainstream hit many gamers are waiting for.
First I should probably justify my “arguably.” The PS5 debuted with the impressive remake of Demon’s Souls, and while I enjoyed that greatly, it was only next-gen in its presentation; many dated aspects faithfully carried over from the original mean it can’t really be considered a fully next generation title. The pack-in Astro’s Playroom is a delight but doesn’t compare with full-scale games. Destruction All-Stars was something of a damp squib. And excellent games like Spider-Man: Miles Morales and Assassin’s Creed: Valhalla span the generations, playing best but by no means exclusively on PS5.
So Returnal really is, in a non-trivial way, the first really “next-gen” PS5 game — and it carries the “next-gen” PS5 price tag of $70, more in many regions. Can it justify this premium? In some ways yes, but like Demon’s Souls this is a difficult game that involves a potentially off-putting amount of repetition and failure for mainstream audiences.
The game starts with your character, a sci-fi space explorer working for a mysterious company called Astra (there are clear nods to Weyland-Yutani from Alien), crashing on a forbidden planet and finding herself — literally — stuck in a sort of time loop. (You’ll see what I mean by literally.)
The developers have obviously seen Prometheus.
Without getting into the specifics of the plot, which is slowly revealed through found recordings, exploration of ancient ruins, and decoding alien symbols, Selene is seemingly trapped on the planet until she can figure out what’s going on, and whenever she dies the world shifts around to provide new challenges and opportunities.
Each loop or “cycle” involves the player starting from the crash site and progressing through the world, different but familiar every time. You encounter enemies, collect power-ups like new weapons or artifacts that affect your abilities, and occasionally an item that will permanently augment your suit or open new paths to take.
To call any individual aspect of the game original would be inaccurate — it takes with a free hand from its august predecessors in both gameplay and presentation. Without spoiling too much I’d say its progression and design share the most with indie breakout hit Dead Cells, with a dose of Risk of Rain 2, and a setting that elaborates on Alien and Prometheus. That said, the story and backstory owe more to Solaris. It wears its influences on its sleeve to be sure, but they come together as something cohesive, not a sloppy pastiche.
Run, gun, rinse, repeat
Returnal starts out almost frustratingly simple, but this is soon remedied as new abilities and layers of complexity are added to the mix; expect the “tutorial” to be meted out over a few hours as things are discovered organically.
You make your way through what amounts to arena after arena, sometimes large and multi-layered, sometimes confined, and fight whatever appears. Combat is frantic and high risk — monsters don’t telegraph ponderous swipes at you but rather spew dozens or hundreds of bullets in your direction, making you rely on smart anticipatory movement and the cluttered landscape to stay alive. As you defeat them you accrue increasingly powerful boons that only last until you get hit, at which point they all disappear, adding a layer of urgency to every encounter: you could gain a crucial edge for the next miniboss — or lose what you’ve built up over minutes of careful play. You can’t take any enemy lightly — those that don’t kill you will make you weaker.
The player moves forward by exploring and eventually defeating an area boss, encounters that are more than a little taxing and generally take a few tries. Then it’s on to a new, different “biome” to do it all again with a different color scheme (and new enemies, hazards and so on).
The look and feel of Returnal is what you might call “early next-gen.” It’s detailed, interesting looking and realistic in a sci-fi way, and it uses lighting and color well to create both a sense of place and gameplay objectives. It’s better in some ways than what you’d expect from a PS4 or Xbox One game but ultimately the advances here seem to be more on the side of “fewer limitations” rather than “new capabilities.” Load times are practically non-existent — a second or two at most — and in places where sightlines are farther than a room or two, the scale of what’s being drawn is impressive. The framerate is a steady 60, making combat fluid no matter how crowded and chaotic it gets.
As for the claim of a “living world” that’s truly different every time, you can pretty much ignore that. You’ll encounter the same rooms and structures repeatedly, maybe with different enemies or items, but don’t expect a wildly different experience every loop. Just enough that the repetition isn’t too repetitious.
Image Credits: Sony/Housemarque
Sound is solid and I’d definitely recommend headphones. Your number one issue is going to be getting blasted in the back and positional audio will help a lot with that, as each enemy has characteristic noises for its actions.
The PS5 controller’s advanced haptics are put to good use with two-stage virtual triggers and a lot of contextual vibrations. I do wish there was a way to control these with a bit more granularity, as the constant patter of rain in the first area was numbing my hands, but the other haptic cues were useful and quickly became second nature.
Games in the “roguelite” (i.e. you start from scratch every life like a roguelike, but occasionally gain permanent upgrades) genre can fall flat if your progression, either within a loop or over many of them, involves little more than “+4% pistol damage” or a few more hit points. Fortunately Returnal is well aware of this and its weapons, artifacts, perks and so on often confer interesting bonuses or risk/reward mechanics. And you only have one weapon at a time, meaning the choice between, say, an assault rifle with special ability A and a shotgun with special ability B is a complex and risky one.
Eventually you’ll be able to skip past certain areas, but you may not want to, preferring to scour side paths for resources so you’re not going into the next boss room naked and afraid. In general the game manages to keep a lot of interesting tensions going on with the player that make every decision consequential, not to say agonizing.
Next-gen price tag
Is Returnal worth its premium $70 asking price? For some people, yes. But this isn’t the kind of mainstream blockbuster that would ordinarily justify the increased cost.
So far I’ve played about 20 hours, done 30-odd runs, and based on what I know I’m about halfway through the game. Most of my progress was made on what I think of as “prestige” runs, the handful where everything goes right and I get much further than before, making them tense and exciting. (Many ended within five minutes due to poor momentum or rage quits.) The game promises replay value past the credits, though, so a guess of 40 hours of content is more of a floor than a ceiling.
One of several trips to your house, inexplicably replicated on the alien planet…
The difficulty may present a barrier to many players. A dialogue at the start of the game warns you that the game is meant to be challenging and that death is part of the journey. Great, but that doesn’t make it any less frustrating when you get ambushed by a dozen enemies, wiping out half an hour of progress in an instant. While generally the game falls into the “tough but fair” category, there are spikes here and there that feel gratuitous, and loops where you feel unlucky or underpowered and have to fight the urge to reset.
I don’t mind personally — compared to the Dark Souls series it’s a cakewalk. I almost beat the first boss on my first encounter; good luck doing that with Ornstein and Smough or Father Gascoigne! But like those games it takes a certain type of player to want to power through the early hours and access the huge amount of value essentially locked behind repeated failure. Similar to how many Demon’s Souls players never progress past that game’s punishing first area, players not ready for the acrobatics and perseverance necessary to traverse the unforgiving bullet hell of Returnal may never escape its gloomy, restrictive first biome for the bright, open second one or glimpse its intriguing backstory. (I’ve included some tips below to help people get through the first hours.)
Compared with the steady progression and traditional storytelling of something like AC: Valhalla or Miles Morales this may put off less masochistic gamers or prompt more than a few controller-throwing, refund-requesting moments. After all, paying $70 for a game that slaps you in the face while you try to access the latter $50 worth of it can be justifiably frustrating.
With all that said, it is nice to see a AAA next-generation game that isn’t a sequel or franchise, and seeing the “roguelite” formula embraced seriously beyond the indie world. Returnal may not be for everyone, but for the subset of gamers who have embraced this genre for years, it’s an easy one to recommend.
Tips for playing:
If you do decide to dive in, here are a handful of non-spoilery “wish I’d known that” tips to get you on the right track.
There are lots of hidden rooms and items, so check your mini-map frequently for things you’ve missed in the chaos and peek in every nook and cranny.
When you’re undamaged, healing pickups contribute towards adding crucial max health — one more reason to not get hit.
New abilities create new opportunities in old areas — there’s a reason so much of the first biome is inaccessible at first.
You can get to those treasures behind bars. Look around carefully (and shoot everything).
“Malignant” treasure is usually more trouble than it’s worth and the debuffs can sink a run. Only take it if you’re desperate or have a cure handy. (Parasites on the other hand can be very useful.)
There’s always a healing item for sale at the biome’s “shop,” so there’s no excuse for going into a boss room without one. (And self-healing artifacts will save your life five times over.)
Calibri, we hardly knew ye. Microsoft’s default font for all its Office products (and built-in apps like Wordpad) is on its way out and the company now needs your help picking a new one. Let’s judge the options!
You probably don’t think much about Calibri, but that’s a good thing in this context. A default font should be something you don’t notice and don’t feel the need to change unless you want something specific. Of course the switch from Times New Roman back in 2007 was controversial — going from a serif default to a sans serif default ruffled a lot of feathers. Ultimately it proved to be a good decision, and anyway TNR is still usually the default for serif-specified text.
To be clear, this is about defaults for user-created stuff, like Word files. The font used by Microsoft in Windows and other official stuff is Segoe UI, and there are a few other defaults mixed in there as well. But from now on, making a new document in an Office product will default to one of these, and the others will be there as options.
Replacing Calibri with another friendly-looking universal sans serif font will be a considerably less dramatic change than 2007’s, but that doesn’t mean we can’t have opinions on it. Oh no. We’re going to get into it. Unfortunately, the only ways Microsoft offers to see the text, apart from writing it out in your own 365 apps, are the tweet (which doesn’t have all the letters) or some colorful but uninformative graphic presentations. So we (and by we I mean Darrell) made our own little specimen to judge by:
You may notice Grandview is missing. We’ll get to that. Starting from the top:
Calibri, here for reference, is an inoffensive, rather narrow font. It gets its friendly appearance from the tips of the letters, which are buffed off like they were afraid kids might run into them. At low resolutions like we had in 2007 this didn’t really come across, but now it’s more obvious and actually a little weird, making it look a bit like magnetic fridge letters.
Bierstadt is my pick and what I think Microsoft will pick. First because it has a differentiated lowercase l, which I think is important. Second, it doesn’t try anything cute with its terminals. The t ends without curling up, and there’s no distracting tail on the a, among other things — sadly the most common letter, lowercase e, is ugly, like a chipped theta. Someone fix it. It’s practical, clear, and doesn’t give you a reason to pick a different font. First place. Congratulations, designer Steve Matteson.
Tenorite is my backup pick, because it’s nice but less practical for a default font. Geometric sans serifs (look at the big fat “dog,” all circles) look great at medium size but small they tend to make for weird, wide spacing. Look at how Bierstadt makes the narrow and wide letters comparable in width, while in Tenorite they’re super uneven, yet both are near the same total length. Also, no, we didn’t mess with the kerning or add extra spaces to the end in “This is Tenorite.” That’s how it came out. Someone fix it! Second Place.
Skeena, apart from sounding like a kind of monster you fight in an RPG, feels like a throwback. Specifically to Monaco, the font we all remember from early versions of MacOS (like System 7). The variable thickness and attenuated tails make for an interesting look in large type, but small it just looks awkward. Best e of the bunch, but something’s wrong with the g, maybe. Someone might need to fix it. Third place.
Seaford is an interesting one, but it’s trying too hard with these angular loops and terminals. The lowercase k and a are horrifying, like broken pretzels. The j looks like someone kicked an i. The d looks like it had too much to eat and is resting its belly on the ground. And don’t get me started on the bent bars of the italic w. Someone fix it. I like the extra strong bold and the g actually works, but this would really bug me to use every day. Fourth Place.
Grandview didn’t render properly for us. It looked like Dingbats in regular, but was fine in bold and italic. Someone fix it. Fortunately I feel confident it won’t be the next default. It’s not bad at all, but it’s inhuman, robotic. Looks like a terminal font no one uses. See how any opportunity there is for a straight line is taken? Nice for a logo — feels strong structurally — but a paragraph of it would look like a barcode. Use it for H2 stuff. Last place.
So what should you “vote” for by tweeting hard at Microsoft? Probably it doesn’t matter. I’m guessing they’ve already picked one. Bierstadt is the smart pick, because it’s good in general while the others are all situational. If they would only fix that damn e.
Sign language is used by millions of people around the world, but unlike Spanish, Mandarin or even Latin, there’s no automatic translation available for those who can’t use it. SLAIT claims to offer the first such tool available for general use, which can translate around 200 words and simple sentences to start — using nothing but an ordinary computer and webcam.
People with hearing impairments, or other conditions that make vocal speech difficult, number in the hundreds of millions and rely on the same common tech tools as the hearing population. But while emails and text chat are useful and of course very common now, they aren’t a replacement for face-to-face communication, and unfortunately there’s no easy way for signing to be turned into written or spoken words, so this remains a significant barrier.
We’ve seen attempts at automatic sign language (usually American/ASL) translation for years and years. In 2012 Microsoft awarded its Imagine Cup to a student team that tracked hand movements with gloves; in 2018 I wrote about SignAll, which has been working on a sign language translation booth using multiple cameras to give 3D positioning; and in 2019 I noted that a new hand-tracking algorithm called MediaPipe, from Google’s AI labs, could lead to advances in sign detection. Turns out that’s more or less exactly what happened.
SLAIT is a startup built out of research done at the Aachen University of Applied Sciences in Germany, where co-founder Antonio Domènech built a small ASL recognition engine using MediaPipe and custom neural networks. Having proved the basic notion, Domènech was joined by co-founders Evgeny Fomin and William Vicars to start the company; they then moved on to building a system that could recognize first 100, and now 200, individual ASL gestures and some simple sentences. The translation occurs offline, and in near real time, on any relatively recent phone or computer.
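The general shape of such a pipeline, a landmark tracker (like MediaPipe) feeding a classifier, can be sketched in miniature. Everything below is illustrative: the toy templates, labels, and nearest-template matching are stand-ins for demonstration, not SLAIT’s actual model, which uses custom neural networks.

```python
import math

# Hypothetical sketch: classify one frame of hand landmarks
# (21 (x, y, z) points, the format a tracker like MediaPipe emits)
# against stored sign templates by cosine similarity.
# Templates and the input frame are toy data, not real ASL poses.

def flatten(landmarks):
    """Flatten a list of (x, y, z) tuples into one feature vector."""
    return [c for point in landmarks for c in point]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def classify(landmarks, templates):
    """Return the template label most similar to the input frame."""
    v = flatten(landmarks)
    return max(templates, key=lambda label: cosine(v, templates[label]))

# Toy templates: two "signs", each a flattened 21-point vector.
templates = {
    "HELLO": [0.1] * 63,
    "THANKS": [0.1] * 31 + [0.9] * 32,
}
frame = [(0.1, 0.1, 0.1)] * 21  # uniform toy frame
print(classify(frame, templates))  # → HELLO
```

A real system would classify sequences of frames rather than single poses, since most signs involve motion, which is where the neural networks come in.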
Image Credits: SLAIT
They plan to make it available for educational and development work, expanding their dataset so they can improve the model before attempting any more significant consumer applications.
Of course, the development of the current model was not at all simple, though it was achieved in remarkably little time by a small team. MediaPipe offered an effective, open-source method for tracking hand and finger positions, sure, but the crucial component for any strong machine learning model is data, in this case video data (since it would be interpreting video) of ASL in use — and there simply isn’t a lot of that available.
As they recently explained in a presentation for the DeafIT conference, the team first evaluated an older Microsoft database, but found that a newer Australian academic database had more and better-quality data, allowing for the creation of a model that is 92% accurate at identifying any of 200 signs in real time. They have augmented this with sign language videos from social media (with permission, of course) and government speeches that have sign language interpreters — but they still need more.
A GIF showing one of the prototypes in action — the consumer product won’t have a wireframe, obviously. Image Credits: SLAIT
Their intention is to make the platform available to the deaf and ASL learner communities, who hopefully won’t mind their use of the system being turned to its improvement.
And naturally it could prove an invaluable tool in its present state, since the company’s translation model, even as a work in progress, is still potentially transformative for many people. With the amount of video calls going on these days and likely for the rest of eternity, accessibility is being left behind — only some platforms offer automatic captioning, transcription, or summaries, and certainly none recognize sign language. But with SLAIT’s tool people could sign normally and participate in a video call naturally rather than using the neglected chat function.
“In the short term, we’ve proven that 200 word models are accessible and our results are getting better every day,” said SLAIT’s Evgeny Fomin. “In the medium term, we plan to release a consumer facing app to track sign language. However, there is a lot of work to do to reach a comprehensive library of all sign language gestures. We are committed to making this future state a reality. Our mission is to radically improve accessibility for the Deaf and hard-of-hearing communities.”
From left, Evgeny Fomin, Antonio Domènech and Bill Vicars. Image Credits: SLAIT
He cautioned that it will not be totally complete — just as translation and transcription in or to any language is only an approximation, the point is to provide practical results for millions of people, and a few hundred words goes a long way toward doing so. As data pours in, new words can be added to the vocabulary, and new multigesture phrases as well, and performance for the core set will improve.
Right now the company is seeking initial funding to get its prototype out and grow the team beyond the founding crew. Fomin said they have received some interest but want to make sure they connect with an investor who really understands the plan and vision.
When the engine itself has been built up to be more reliable by the addition of more data and the refining of the machine learning models, the team will look into further development and integration of the app with other products and services. For now the product is more of a proof of concept, but what a proof it is — with a bit more work SLAIT will have leapfrogged the industry and provided something that deaf and hearing people both have been wanting for decades.
While the concept of “deepfakes,” or AI-generated synthetic imagery, has been decried primarily in connection with involuntary depictions of people, the technology is dangerous (and interesting) in other ways as well. For instance, researchers have shown that it can be used to manipulate satellite imagery to produce real-looking — but totally fake — overhead maps of cities.
The study, led by Bo Zhao from the University of Washington, is not intended to alarm anyone but rather to show the risks and opportunities involved in applying this rather infamous technology to cartography. In fact their approach has as much in common with “style transfer” techniques — redrawing images in impressionistic, crayon, or other arbitrary fashions — as with deepfakes as they are commonly understood.
The team trained a machine learning system on satellite images of three different cities: Seattle, nearby Tacoma, and Beijing. Each has its own distinctive look, just as a painter or an artistic medium does. For instance, Seattle tends to have larger overhanging greenery and narrower streets, while Beijing is more monochrome and — in the images used for the study — the taller buildings cast long, dark shadows. The system learned to associate details of a street map (like Google’s or Apple’s) with those of the satellite view.
The resulting machine learning agent, when given a street map, returns a realistic-looking faux satellite image of what that area would look like if it were in any of those cities. In the following image, the map corresponds to the top right satellite image of Tacoma, while the lower versions show how it might look in Seattle and Beijing.
Image Credits: Zhao et al.
A close inspection will show that the fake maps aren’t as sharp as the real one, and there are probably some logical inconsistencies like streets that go nowhere and the like. But at a glance the Seattle and Beijing images are perfectly plausible.
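For context, image-to-image translation systems of the kind this work resembles (pix2pix-style conditional GANs; the paper’s exact objective may differ) train a generator $G$ against a discriminator $D$ on pairs of a street map $x$ and a real satellite tile $y$, optimizing a loss of roughly this form:

```latex
\mathcal{L} = \mathbb{E}_{x,y}\bigl[\log D(x, y)\bigr]
            + \mathbb{E}_{x}\bigl[\log\bigl(1 - D(x, G(x))\bigr)\bigr]
            + \lambda\, \mathbb{E}_{x,y}\bigl[\lVert y - G(x) \rVert_1\bigr]
```

The first two terms push $G$ to produce tiles the discriminator can’t tell from real imagery, while the $\lambda$-weighted L1 term keeps the output pixel-aligned with the ground truth; the blurriness visible in the fakes is a typical artifact of that reconstruction term.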
One only has to think for a few minutes to conceive of uses for fake maps like this, both legitimate and otherwise. The researchers suggest that the technique could be used to simulate imagery of places for which no satellite imagery is available — like one of these cities in the days before such things were possible, or for a planned expansion or zoning change. The system doesn’t have to imitate another place altogether — it could be trained on a more densely populated part of the same city, or one with wider streets.
And should technology like this be bent to less constructive purposes, the paper also looks at ways to detect such simulated imagery using careful examination of colors and features.
The work challenges the general assumption of the “absolute reliability of satellite images or other geospatial data,” said Zhao in a UW news article, and certainly as with other media that kind of thinking has to go by the wayside as new threats appear. You can read the full paper at the journal Cartography and Geographic Information Science.
Earth imaging is an increasingly crowded space, but Satellite Vu is taking a different approach by focusing on infrared and heat emissions, which are crucial for industry and climate change monitoring. Fresh from TechCrunch’s Startup Battlefield, the company has raised a £3.6M ($5M) seed round and is on its way to launching its first satellite in 2022.
The nuts and bolts of Satellite Vu’s tech and master plan are described in our original profile of the company, but the gist is this: while companies like Planet have made near-real-time views of the Earth’s surface into a thriving business, other niches are relatively unexplored — like thermal imaging.
The heat coming off a building, geological feature, or even a crowd of people is an enormously interesting data point. It can tell you whether an office building or warehouse is in use or empty, and whether it’s heated or cooled, and how efficient that process is. It can find warmer or cooler areas that suggest underground water, power lines, or other heat-affecting objects. It could even make a fair guess at how many people attended a concert, or perhaps an inauguration. And of course it works at night.
You could verify, for instance, which parts of a power plant are active, when.
Pollution and other emissions are also easily spotted and tracked, making infrared observation of the planet an important part of any plan to monitor industry in the context of climate change. That’s what attracted Satellite Vu’s first big piece of cash, a grant from the U.K. government for £1.4M, part of a £500M infrastructure fund.
CEO and founder Anthony Baker said the company began construction of its first satellite with that money (“so we knew we got our sums right”), then began the process of closing additional capital.
Seraphim Capital, a space-focused VC firm whose most relevant venture is probably synthetic aperture satellite startup Iceye, matched the grant funds, and with a subsequent grant the total raised now exceeds the $5M target (the extra is set aside in a convertible note).
“What attracted us to Satellite Vu is several things. We published some research about this last year: there are more than 180 companies with plans to launch smallsat constellations,” said Seraphim managing partner James Bruegger. But very few, they noted, were looking at the infrared or thermal space. “That intrigued us, because we always thought infrared had a lot of potential. And we already knew Anthony and Satellite Vu from having put them through our space accelerator in 2019.”
They’re going to need every penny. Though the satellites themselves are looking to be remarkably cheap, as satellites go — $14-15M all told — and only seven will be needed to provide global coverage, that still adds up to over $100M over the next couple of years.
Image Credits: Satellite Vu
Seraphim isn’t daunted, however: “As a specialist space investor, we understand the value of patience,” said Bruegger. Satellite Vu, he added, is a “poster child” for their approach, which is to shuttle early stage companies through their accelerator and then support them to an exit.
It helps that Baker has lined up about as much potential income from interested customers as they’ll need to finance the whole thing, soup to nuts. “Commercial traction has improved since we last spoke,” said Baker, which was just before he presented at TechCrunch’s Disrupt 2020 Startup Battlefield:
The company now has 26 letters of intent and other leads that amount to, in his estimation, about a hundred million dollars worth of business — if he can provide the services they’re asking for, of course. To that end the company has been flying its future orbital cameras on ordinary planes and modifying the output to resemble what they expect from the satellite network.
Companies interested in the eventual satellite data can buy the aerial version for now, and the transition to the “real” product should be relatively painless. It also helps create a pipeline on Satellite Vu’s side, so there’s no need for a test satellite and service.
Another example of the simulated satellite imagery — the same camera that will be in orbit, with output degraded to resemble shots from that far up.
“We call it pseudo-satellite data — it’s almost a minimum viable product. We work with the companies about the formats and stuff they need,” Baker said. “The next stage is, we’re planning on taking a whole city, like Glasgow, and mapping the whole city in thermal. We think there will be many parties interested in that.”
With investment, tentative income, and potential customers lining up, Satellite Vu seems poised to make a splash, though its operations and launches are small compared with those of Planet, Starlink, and very soon Amazon’s Kuiper. After the first launch, tentatively scheduled for 2022, Baker said the company would only need two more to put the remaining six satellites in orbit, three at a time on a rideshare launch vehicle.
Before that, though, we can expect further fundraising, perhaps as soon as a few months from now — after all, however thrifty the company is, tens of millions in cash will still be needed to get off the ground.
One never knows how a confirmation hearing will go these days, especially one for a young outsider nominated to an important position despite challenging the status quo and big business. Lina Khan, just such a person up for the position of FTC Commissioner, had a surprisingly pleasant time of it during today’s Senate Commerce Committee confirmation hearing — possibly because her iconoclastic approach to antitrust makes for good politics these days.
Khan, an associate professor of law at Columbia, is best known in the tech community for her incisive essay “Amazon’s Antitrust Paradox,” which laid out the failings of regulatory doctrine that have allowed the retail giant to progressively dominate more and more markets. (She also recently contributed to a House report on tech policy.)
When it was published, in 2017, the feeling that Amazon had begun to abuse its position was, though commonplace in some circles, not really popular in the Capitol. But the growing sense that laissez-faire or insufficient regulations have created monsters in Amazon, Google, and Facebook (to start) has led to a rare bipartisan agreement that we must find some way, any way will do, of putting these upstart corporations back in their place.
This in turn led to a sense of shared purpose and camaraderie in the confirmation hearing, which was a triple header: Khan joined Bill Nelson, nominated to lead NASA, and Leslie Kiernan, who would join the Commerce Department as General Counsel, for a really nice little three-hour chat.
Khan is one of several in the Biden administration who signal a new approach to taking on Big Tech and other businesses that have gotten out of hand, and the questions posed to her by Senators from both sides of the aisle seemed genuine and got genuinely satisfactory answers from a confident Khan.
She deftly avoided a few attempts to bait her — including one involving Section 230; wrong Commission, Senator — and her answers primarily reaffirmed her professional opinion that the FTC should be better informed and more preemptive in its approach to regulating these secretive, powerful corporations.
Here are a few snippets representative of the questioning and indicative of her positions on a few major issues (answers lightly edited for clarity):
On the FTC getting involved in the fight between Google, Facebook, and news providers:
“Everything needs to be on the table. Obviously local journalism is in crisis, and I think the current COVID moment has really underscored the deep democratic emergency that is resulting when we don’t have reliable sources of local news.”
She also cited the increasing concentration of ad markets and the arbitrary nature of, for example, algorithm changes that can have wide-ranging effects on entire industries.
On Clarence Thomas’s troubling suggestion that social media companies should be considered “common carriers”:
“I think it prompted a lot of interesting discussion,” she said, very diplomatically. “In the Amazon article, I identified two potential pathways forward when thinking about these dominant digital platforms. One is enforcing competition laws and ensuring that these markets are competitive.” (i.e. using antitrust rules)
“The other is, if we instead recognize that perhaps there are certain economies of scale, network externalities that will lead these markets to stay dominated by a very few number of companies, then we need to apply a different set of rules. We have a long legal tradition of thinking about what types of checks can be applied when there’s a lot of concentration and common carriage is one of those tools.”
“I should clarify that some of these firms are now integrated in so many markets that you may reach for a different set of tools depending on which specific market you’re looking at.”
(This was a very polite way of saying common carriage and existing antitrust rules are totally unsuitable for the job.)
On potentially reviewing past mergers the FTC approved:
“The resources of the commission have not really kept pace with the increasing size of the economy, as well as the increasing size and complexity of the deals the commission is reviewing.”
“There was an assumption that digital markets in particular are fast moving so we don’t need to be concerned about potential concentration in the markets, because any exercise of power will get disciplined by entry and new competition. Now of course we know that in the markets you actually have significant network externalities in ways that make them more sticky. In hindsight there’s a growing sense that those merger reviews were a missed opportunity.”
(Here Senator Blackburn (R-TN) in one of the few negative moments fretted about Khan’s “lack of experience in coming to that position” before asking about a spectrum plan — wrong Commission, Senator.)
On the difficulty of enforcing something like an order against Facebook:
“One of the challenges is the deep information asymmetry that exists between some of these firms and enforcers and regulators. I think it’s clear that in some instances the agencies have been a little slow to catch up to the underlying business realities and the empirical realities of how these markets work. So at the very least ensuring the agencies are doing everything they can to keep pace is gonna be important.”
“In social media we have these black box algorithms, proprietary algorithms that can sometimes make it difficult to know what’s really going on. The FTC needs to be using its information gathering capacities to mitigate some of these gaps.”
On extending protections for children and other vulnerable groups online:
“Some of these dangers are heightened given some of the ways in which the pandemic has rendered families and children especially dependent on some of these [education] technologies. So I think we need to be especially vigilant here. The previous rules should be the floor, not the ceiling.”
Overall there was little partisan bickering and a lot of feeling from both sides that Khan was, if not technically experienced at the job (not unusual for a nominee to a coveted position like FTC Commissioner), about as competent a nominee as anyone could ask for. Not only that but her highly considered and fairly assertive positions on matters of antitrust and competition could help put Amazon and Google, already in the regulatory doghouse, on the defensive for once.
We all know these constant video calls are doing something to our brains. How else could we get tired and frazzled from sitting around in our own homes all day? Well, now Microsoft has done a little brain science and found out that yeah, constant video calls do increase your stress and brain noise. Tell your boss!
The study had 14 people participate in eight half-hour video calls, divided into four a day — one day with ten-minute breaks between, and the other all in one block. The participants wore EEG caps: brain-monitoring gear that gives a general idea of types of activity in the old grey matter.
What they found is not particularly surprising, since we all have lived it for the last year (or more for already remote workers), but still important to show in testing. During the meeting block with no breaks, people showed higher levels of beta waves, which are associated with stress, anxiety, and concentration. There were higher peaks and a higher average stress level, and stress crept up slowly as time went on.
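Beta-wave activity is conventionally measured as signal power in roughly the 13-30 Hz band of the EEG. A minimal sketch of that computation (not Microsoft’s actual pipeline; the sample rate and toy signals below are made up for demonstration) uses a plain discrete Fourier transform over one channel of samples:

```python
import math

# Illustrative sketch: estimate beta-band (~13-30 Hz) power from one
# channel of EEG samples using a plain DFT. Real pipelines would use
# an FFT, windowing, and artifact rejection; this is the bare idea.

def band_power(samples, rate, lo=13.0, hi=30.0):
    """Sum squared DFT magnitudes over bins whose frequency lies in [lo, hi]."""
    n = len(samples)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * rate / n
        if lo <= freq <= hi:
            re = sum(s * math.cos(2 * math.pi * k * i / n)
                     for i, s in enumerate(samples))
            im = sum(-s * math.sin(2 * math.pi * k * i / n)
                     for i, s in enumerate(samples))
            power += (re * re + im * im) / n
    return power

rate = 128  # Hz, a common EEG sampling rate (assumed for this demo)
t = [i / rate for i in range(rate)]  # one second of samples
alpha = [math.sin(2 * math.pi * 10 * x) for x in t]  # 10 Hz: outside beta band
beta = [math.sin(2 * math.pi * 20 * x) for x in t]   # 20 Hz: inside beta band
print(band_power(alpha, rate) < band_power(beta, rate))  # → True
```

Higher readings from this kind of band-power estimate, averaged over a meeting, are what “higher levels of beta waves” cashes out to.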
Taking ten-minute breaks kept stress readings lower on average and prevented them from rising; the breaks also improved other measures of positive engagement.
Image Credits: Microsoft/Valerio Pellegrini
It’s certainly validating even if it seems obvious. And while EEG readings aren’t the most exact measurement of stress, they’re fairly reliable and better than a retrospective self-evaluation along the lines of “How stressed were you after the second meeting on a scale of 1-5?” And of course it wouldn’t be safe to take your laptop into an MRI machine. So while this evidence is helpful, we should be careful not to exaggerate it, or forget that the stress takes place in a complex and sometimes inequitable work environment.
For instance: A recent study published by Stanford shows that “Zoom Fatigue,” as they call it (a mixed blessing for Zoom), is disproportionately suffered by women. More than twice as many women as men reported serious post-call exhaustion — perhaps because women’s meetings tend to run longer and they are less likely to take breaks between them. Add to that the increased focus on women’s appearance and it’s clear this is not a simple “no one likes video calls” situation.
Microsoft, naturally, has tech solutions to the problems in its Teams product, such as adding buffer time to make sure meetings don’t run right into each other, or the slightly weird “together mode” that puts everyone’s heads in a sort of lecture hall (the idea being it feels more natural).
Stanford has a few recommendations, such as giving yourself permission to do audio only for a while each day, positioning the camera far away and pacing around (make sure you’re dressed), or just turning off the self-view.
Ultimately the solutions can’t be entirely individual, though — they need to be structural, and though we may be leaving the year of virtual meetings behind, there can be no doubt there will be more of them going forward. So employers and organizers need to be cognizant of these risks and create policies that mitigate them — don’t just add to employee responsibilities. If anyone asks, tell them science said so.