In online meetings, it’s easy to keep people from talking over each other. Someone just hits the mute button. But for the most part, this ability doesn’t translate easily to recording in-person meetings. In a cafe, there are no buttons to silence the table beside you.
The ability to locate and manage sound (separating one person talking from a specific location in a crowded room, for example) has challenged researchers, especially without the help of cameras.
A team led by researchers at the University of Washington has developed a shape-changing smart speaker, which can divide rooms into speech areas and track the positions of individual speakers. With the help of the team’s deep-learning AI model, the system lets users mute certain areas or separate simultaneous conversations, even if two people have similar voices. In a room meeting, such a system might be used instead of a central microphone, allowing better control of in-room sound.
“If I close my eyes and there are 10 people talking in a room, I have no idea who’s saying what and where they are in the room exactly. That’s very intractable for the human brain to process. Until now, it’s also been hard for technology,” said co-lead author Malek Itani. “For the first time, we’re able to track the positions of different people talking in a room and separate their speech.” Earlier research required overhead cameras, projectors or special surfaces. The new system is the first to use only sound.
Instead of processing the sound in the cloud, as most smart speakers do, the new system processes all the sound locally. And even though some people’s first thoughts may be about observation, the system can be used for the opposite, the team says.
“It can actually benefit privacy, beyond what current smart speakers allow,” Itani said. “I can say, ‘Don’t record anything around my desk,’ and our system will create a bubble 3 feet around me. Nothing in this bubble would be recorded.”
Question 1. What did the research team focus on?
A. Allowing real-time communication by AI.
B. Developing AI-powered language models.
C. Lowering the background noise of conversations.
D. Tracking and controlling sound in crowded settings.
A. Dangerous. B. Natural.
C. Difficult. D. Necessary.
A. Educational. B. Influential.
C. Pioneering. D. Costly.
A. It records nearby conversations.
B. It offers improved privacy protection.
C. It deadens the noise in a particular space.
D. It includes simultaneous translation service.