Cerebro Voice is a silent speech communication device developed by a team in the 42 Robotics Lab. The team is led by Dan Goncharov, the head of 42 Robotics, and work on the proof of concept started about six months ago. The device collects electrical signals directly from the vocal cords and other muscles involved in voice articulation, then uses machine learning to perform silent speech recognition. Silent speech recognition, also known as subvocalization recognition, is the ability to recognize words without the person needing to say them out loud. It was first seen back in 2004 in research conducted at NASA Ames Research Center, where the technology was named “synthetic telepathy.”
Cerebro Voice addresses problem areas that come with the rise of personal AI assistants that rely solely on voice recognition, such as Google Assistant, Alexa, and Siri. Imagine being able to have a conversation with Siri privately, in silence. Collecting non-audio signals as another stream of data for voice recognition can help with hands-free operation (speaking is about 3x faster than average texting) and with safer driving while communicating through a smartphone. Other benefits include speech recognition accuracy (it is hard to isolate a person’s voice in a noisy environment), privacy (anything you ask Siri can be heard by other people too), and personal identification and authentication, another area where AI can help. Silent speech technology can also serve the people who need it most, such as those who have lost the ability to speak for medical reasons.
I sat down with current Robotics team member Taylor Yang and former Robotics team member Annie Ho, both of whom have worked on the Cerebro Voice project. They recently presented at Hackaday, where Cerebro Voice was one of 20 finalists for the human-computer interface challenge.
What inspired Cerebro Voice?
Taylor: We finished up on Mycotronics (our smart incubator project), and soon after, Dan initiated this project. I was inspired to make a real device to capture silent speech. That is how we started the machine learning piscine, and eventually we got to the subvocalization recognition project, where we try to record data using EMG. Based on that data, we wanted to see if we could train a model to differentiate between words.
Annie: A lot of people don’t realize that subvocalization is a phenomenon that comes from learning how to read. You start by pronouncing the words out loud, but even when you read silently, the EMG signals are still there and we can record them. It makes sense that there is a tie between reading and speaking; it is one of those things we don’t realize even exists. It makes you wonder what other biometric signals could be measured that we aren’t aware of yet.
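As a rough illustration of the idea Taylor describes (recording EMG windows and training a model to tell words apart), here is a minimal sketch. It is not the team's actual pipeline: the signals are synthetic, the sampling rate, the two features (RMS amplitude and zero-crossing rate), the nearest-centroid classifier, and every function name are assumptions chosen only to keep the example small.

```python
import math
import random

SAMPLE_RATE_HZ = 1000.0  # assumed sampling rate for the synthetic signal


def rms(window):
    """Root-mean-square amplitude of one window of samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def zero_crossing_rate(window):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return crossings / (len(window) - 1)


def features(window):
    """Two simple hand-picked features (an assumption, not the team's choice)."""
    return (rms(window), zero_crossing_rate(window))


def synthetic_window(amplitude, freq_hz, rng, n=200):
    """Stand-in for a recorded EMG window: a sine burst plus Gaussian noise."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)
        + rng.gauss(0, 0.05)
        for i in range(n)
    ]


def train_centroids(labelled_windows):
    """Average the feature vectors for each word label."""
    sums = {}
    for label, window in labelled_windows:
        f = features(window)
        acc = sums.setdefault(label, [0.0, 0.0, 0])
        acc[0] += f[0]
        acc[1] += f[1]
        acc[2] += 1
    return {label: (a / n, b / n) for label, (a, b, n) in sums.items()}


def predict(centroids, window):
    """Return the label whose centroid is closest in feature space."""
    f = features(window)
    return min(centroids, key=lambda label: math.dist(f, centroids[label]))


# Hypothetical two-word vocabulary with made-up signal parameters.
rng = random.Random(0)
training = [("yes", synthetic_window(1.0, 40.0, rng)) for _ in range(10)]
training += [("no", synthetic_window(0.3, 120.0, rng)) for _ in range(10)]
centroids = train_centroids(training)
print(predict(centroids, synthetic_window(1.0, 40.0, rng)))
```

Real EMG recordings are far noisier and vary between speakers, which is why the team stresses the need for more data; the sketch only shows the shape of the record, featurize, train, and predict loop.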
What type of practical uses does Cerebro Voice have?
Taylor: The most practical use, since we want to focus on the market where this tech is most needed, would be medical devices. This could be used by people who have had laryngectomies or who have a hard time talking. We can recognize what they are trying to say without them opening their mouths. That is a good way for our product to be used, and it demonstrates what an impact it can have.
Annie: You want to bring this project to the people who need it most; that is how we would want to validate this technology. A bigger consumer base is picky and needs a polished product. The people willing to take on a more unrefined product are the ones who really need it.
What work are you doing or have you done on Cerebro Voice?
Annie: I was sort of there from the beginning. The original project evolved from the very first machine learning piscine. We were playing with brain waves and trying to understand EEG signals. EMG waves are a thousand times stronger, and that is how we came across the whole phenomenon of subvocalization. It caught Dan’s attention, and we were like, “What can we do with this?” So subvocalization became part of the second machine learning piscine. I saw it go from the raw process of learning about subvocalization to a full-fledged project. I did whatever the project needed and learned about machine learning. The first machine learning piscine was a month; the second was six weeks. It is a tough subject, at least for me. I am not done learning it, and I still need to revisit everything. Cerebro Voice gave us a directed goal, and it was cool to see us gather the data for the project. During the first piscine we used data that was already available. With subvocalization, there isn’t a lot on the market, so we had to build this thing for ourselves. Dan came up with the stationary data acquisition rig and added 3D-printed parts, and we hacked it together with a high-end audio amplifier to capture the signals. I helped with some of the 3D printing, and I really got an understanding of hardware and what goes into a hardware project. I had to start reviewing physics from high school. I didn’t realize that getting my hands dirty on a project, and combining hardware and software with 3D modeling, is such an important skill to have. Machine learning is a lot more accessible now, and so much more applicable, like data science.
Taylor: We were doing the Google Developers Machine Learning Crash Course, which makes machine learning easy to understand. At first, I was working on the design. We switched roles every now and then; I have a background in industrial design, so I worked on the casing and the 3D model as well. I was setting up the demo video that we filmed for the various VCs (venture capitalists). I basically had the whole set running, although I wasn’t working on the stationary rig at first. I had to collect my own data and run it through the model, so I became familiar with the whole experiment. Right now I am working on the new mobile device, since the stationary rig is so big that it isn’t useful for everyday usage. We want to use a smaller chip, so that is what I am working on. I really like the design aspect of the project, but we have to finish the working prototype first. I am not familiar with circuits and electrical engineering, so that’s what I am trying to learn right now: figuring out what a capacitor is, what a resistor is, small things like that. I still have fun learning, and my goal is to be able to build a product by myself as a full-stack engineer in hardware and software. It is weird, because the first time I came to the lab someone asked if anyone wanted to work on the website and I said yes. Then I went down this rabbit hole of machine learning, scripting, and hardware. It is crazy.
What has been the most difficult part of working on Cerebro Voice?
Taylor: The hardest part for me is the hardware development: knowing what we want and actually making it work. Different circuit boards have different drivers and different manufacturers, and they are programmed differently. What I am struggling with is that when a part isn’t popular, there isn’t as much documentation, so you just have to experiment and see what works.
Annie: Looking back now, I have an idea of how things work, but during the learning process I had no idea what was going on. After understanding the fundamentals, I needed to put together a basic circuit, and I needed to learn how it relates to future hardware development. A lot of days it felt like it was going nowhere.
What do you enjoy most about Cerebro Voice?
Taylor: Being able to work on different aspects of the project is very fun for me. Working on software and hardware, visiting VCs, seeing how they operate, and learning what they are looking for is very valuable. So far we have visited a VC in San Francisco, talked with them, and gained an understanding of what we lack. They really ask the critical questions about what they are looking for in a startup. Meeting people at conferences who talk about what they are doing and share their work is exciting as well. Before, I would go to conferences but I didn’t have anything to show; I was just a bystander with nothing to talk about. Now I have something to talk about and share, which is great. We just went to Hackaday, where we were able to talk to people about Cerebro Voice, and it was cool. People get excited when they hear about new tech they haven’t tried before. It makes you feel happy that you are doing this weird thing, and it gives you inspiration.
Annie: It is pretty much the idea of building a future, and asking myself why I wanted to become a software engineer in the first place. I wanted to build things that have an impact on people, and I like the idea of working on technology that helps people who are really in need. Making people’s lives better drew me to it. I thought the project was really awesome.
What have you learned from working on this project?
Taylor: I learned that I don’t learn fast enough (laughs).
Annie: I’m coming away from it knowing what it means to do machine learning and what it means to work on a hardware project. I don’t know how I would ever have learned these things without working on a project. Having a mentor who can guide you during the process is important.
What future do you see for Cerebro Voice?
Taylor: As of today we are still pushing for our first mobile device. Hopefully, when that comes out, more people will be interested in it. What we lack right now is a device that can really deliver what we are trying to do. We need more data and a good device, and we want a small pool of early adopters who are willing to use our prototype so they can give us data and feedback. They can also show investors that there are people willing to use this product. Dan said the hardware aspect isn’t really that hard, because all it does is collect the EMG data from your body. What is more important is the data we need to train the model so that it works for other people. There is still a way to go, but we will work hard on it!
Annie: Even though I am off the project now, I hope it continues. One of the big things that needs to happen is producing a peer-reviewed academic paper, which is important for any new technology. Dan has a strong belief that this technology is going to be here within the next five years. So much of the technology we have these days was like the wild west five years ago; if you continue to learn and stick around long enough, you end up further along. Relating it back to becoming a software engineer, I went back to things I enjoyed as a kid. I enjoy building websites, but I ended up going into communications, and it took time to come back to it.