    2023.10.12 We Play

    Audio AI Application Technology Team

    Joyful Research! Experts Exploring Sound

    Previously, we highlighted NC's cultural unity through diversity by introducing the TAD team in the first episode of the WE PLAY series, and we also showcased the newly hired employees of 2023 in the second episode. In our third episode of WE PLAY, we are pleased to introduce the Audio AI Application Technology Team, which embodies NC's culture of enjoying work.

    The Audio AI Application Technology Team consists of experts from various research backgrounds. The team has built an environment in which each expert can make the most of their expertise, assigning projects based on preference and fostering a collaborative, enthusiastic atmosphere.

    Gathering of Experts Elevating the Potential of Sound

    The Audio AI Application Technology Team focuses on human sensory perception, particularly hearing. In other words, the team works on recognizing and generating sound signals, extracting valuable information from them, and improving overall sound quality.

    Auditory technology is especially crucial in the development of the "Digital Human." The team currently assigns one researcher per project, ensuring a specialized approach.

    Projects are categorized into Keyword Spotting, Multimodal Emotion Recognition, Target Speaker Extraction, and De-noise & De-reverberation. These tasks involve, respectively, recognizing wake words for Digital Humans, understanding users' emotions, extracting individual voices from mixed sources, and removing environmental noise and reverberation.

    The team plans to concentrate on two major projects in the future: Sound Event Detection, which involves recognizing sounds occurring in the environment, and Sound Effect Generation, which involves generating sounds.

    In summary, the team is dedicated to exploring the possibilities of sound.

    Satisfactory Work Environment Boosts Efficiency

    Project Assignments That Respect Preferences

    Projects in the Audio AI Application Technology Team typically span a year or more, so it is crucial to get assignments right at the start of each project. For this reason, the team allocates projects with researchers' personal preferences in mind.

    When initiating a project, team members express their interest in research topics and collectively decide on project assignments through discussion. This approach is founded on the belief that people who work on projects they are passionate about, rather than on tasks handed to them, draw on passion, curiosity, and enjoyment. In fact, when team members get to work on projects they genuinely want, their enthusiasm for research increases. Moreover, since they have chosen their tasks themselves, they feel a greater sense of responsibility and can devote more focus to them. The aim is a satisfying work environment in which everyone helps make the work enjoyable.

    Culture of Finding Answers Together

    For a research team, problem-solving skills are essential. The way the team solves problems is surprisingly simple: members freely turn to each other for help and ask questions.

    This work culture is only possible thanks to an atmosphere where there is no shame in saying, "I don't know." When there are parts that are not well understood or moments of uncertainty during work, researchers comfortably ask questions and provide answers to each other, collaborating to find solutions.

    Culture That Boosts Expertise and Efficiency

    Once a week, the team sets aside dedicated time to share individual work progress. Team members take turns presenting, and the content is more detailed than in a typical reporting meeting. During these sessions, the team freely discusses issues and exchanges ideas, enabling quicker and more accurate progress toward its goals.

    The presentation materials from these sharing meetings are accumulated over the year. As a result, each researcher's portfolio builds up naturally by the end of the year, which reduces the workload for evaluations and improves overall work efficiency.

    Enjoyable Research Leading to Achievements

    Certified as World-Class Technology

    The team's visible achievements are noteworthy. In the field of audio technology, the team had three papers accepted at INTERSPEECH 2023, an internationally prestigious conference. This official recognition of the team's technology as world-class is a very special achievement for the newly formed team of five researchers.

    The team operates a period called the "Paper Season" around the end and beginning of the year when projects are nearing completion. During this time, they provide opportunities for each team member's research accomplishments to be officially recognized, fostering a more enjoyable research environment.

    Team leader Cho Namhyun commented on this achievement, saying, "It's rewarding to see capable colleagues proving their worth. It provides a sense of accomplishment and reassurance in a team culture where research is enjoyable, and it's followed by good results."

    Edging Closer to Real Humans

    Among various audio-related technologies, the Audio AI Application Technology Team primarily focuses on auditory technology research, with the specific aim of making "Digital Humans" closely resemble real humans.

    Their key research topics, including keyword spotting, multimodal emotion recognition, target speaker voice extraction, and distance-based voice separation, all hold significant research value. Voice extraction technology, in particular, is expected to address one of the most challenging problems in voice recognition technology, known as the "cocktail party effect."

    *Cocktail Party Effect: This term originates from the phenomenon where partygoers can selectively tune in to a single conversation amid surrounding noise at a party. It refers to the psychological phenomenon of selectively focusing on meaningful information while disregarding the surrounding environment.

    Contributing to Intellectual Property through Patent Preparation

    The team has also filed international patent applications for its "Zero-shot keyword spotting (ZKWS)" technology, which serves as the initial step in conversing with a Digital Human. This technology is gaining attention because it doesn't require additional training to create keyword models. Remarkably, ZKWS outperformed similar technologies from global tech giants released at the same time, despite being 6.5 times smaller in model size.

    ZKWS technology can be applied not only to interactions with Digital Humans but also to targeting various characters on a game screen by voice. It is also expected to be useful in language learning platforms, which could make use of the phoneme-level alignment scores that it outputs.

    Challenge Toward the Completion of Auditory Intelligence

    The Audio AI Application Technology Team dreams of creating a Digital Human with hearing abilities close to those of a real person. Until now, the team's research has centered on human speech signals. Going forward, it plans to expand its research to encompass all sound signals, including speech.

    As a first step, the Audio AI Application Technology Team is currently researching the recognition of hundreds of different sound events. In conjunction with this research, the team also plans to explore technology that can estimate the source location of acoustic events, as well as generative AI that can directly generate acoustic events. With these technologies, Digital Humans could gain human-like hearing capabilities. For instance, they could detect emergencies from sound, or ask users about the sound of a baby crying. Furthermore, by researching generative AI capable of producing sounds according to user intentions, the team is preparing to make a significant leap toward a more human-like Digital Human that can support users in creative activities.

    Until the true auditory intelligence of the Digital Human is realized, the enjoyable research of the Audio AI Application Technology Team will continue.