    2023.08.17 R&D

    NC Unveils VARCO LLM, Its First Self-Developed Language Model

    We are pleased to introduce VARCO LLM, our first language model developed in-house. It will be offered at three scales: small, medium, and large. The initial release comprises models with 1.3 billion, 6.4 billion, and 13 billion parameters. This article covers the distinctive features of these models and the vision behind them. We hope the opportunities this new language model opens up will resonate with a wide audience.

    VARCO, Via AI, Realize your Creativity and Originality

    Recently, a primary concern for global companies has been how to use generative AI to generate profit. NC has long researched AI and NLP, so while we welcome this trend, our focus has been different. NC firmly believed that artificial intelligence held immense promise not only in facilitating various modes of communication but also in fostering creativity across different fields. Recognizing its potential to enhance productivity, NC homed in on two primary objectives: the development of digital personas and the advancement of game design. Realizing that large language models are key to achieving these goals, NC promptly began developing its own.

    NC sees artificial intelligence as a game-changer in how humans and AI work together, especially in creative tasks. This means simplifying complicated work and automating repetitive tasks, giving people the opportunity to fully tap into their creative capacities. NC named its AI language model VARCO to reflect this idea. VARCO stands for "Via AI, Realize your Creativity and Originality," which expresses the goal of using AI to be more creative.

    NC's aspiration is for people to focus on refining the essence of creation while collaborating seamlessly with AI. To realize this, NC plans to establish diverse AI research projects and businesses under the “VARCO” brand. Starting with this announcement, NC will unveil language models at small, medium, and large scales this year, while also actively pursuing technology partnerships. These efforts will culminate in the completion of the AI platform service “VARCO Studio,” intended for integration across all game production processes, including art creation and scenario writing.

    VARCO LLM Roadmap

    VARCO possesses the capacity not only to engage in collaborative creation but also to introduce novel gaming experiences. Through its integration with digital human creation technology, it holds the potential to impart a heightened sense of immersion, bringing interactions to a new level. In accordance with its objectives and strategic vision, NC is poised to fashion tailor-made models, aligning foundational technologies to introduce various services.

    The First VARCO LLM, a Language Model Tailored to Both Cost-effectiveness and Practicality

    VARCO LLM stands as a high-performance language model trained by NC using top-tier pre-training data. The VARCO language model is broadly classified into four distinct categories: ▲ Foundation Model, ▲ Instruction Prompt Model, ▲ Conversational Model, and ▲ Generative Model. This classification facilitates a performance distinction that aligns with parameter scale.

    The language models introduced this time are personalized models with 1.3 billion, 6.4 billion, and 13 billion parameters. Each exists in two variants: the foundation model, which serves as the base, and the instruction model, which can be tuned with prompts for specific applications.

    In general, a language model with more parameters performs better, but it also costs more to operate. The small and medium-sized models, with parameter counts below 13 billion, do not require supercomputing capabilities and can run on a single GPU server, which makes them cost-effective. They also require fewer computational resources to train, making them suitable for building customized models for the specialized language domains of various services.
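
    The link between parameter count and hardware cost can be made concrete with some rough arithmetic. The sketch below estimates the memory needed just to hold each model's weights. It assumes 16-bit weights (2 bytes per parameter), which this article does not state, and it ignores activations, caches, and any training overhead. The function and variable names are illustrative only.

```python
# Rough estimate of weight memory per model size.
# ASSUMPTION: weights are stored in 16-bit precision (2 bytes each);
# the article does not specify the precision VARCO LLM uses.
BYTES_PER_PARAM_FP16 = 2

# Parameter counts taken from the article.
PARAM_COUNTS = {
    "1.3B": 1.3e9,
    "6.4B": 6.4e9,
    "13B": 13e9,
}


def weight_memory_gib(num_params, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Return the memory (in GiB) needed to store the weights alone."""
    return num_params * bytes_per_param / 2**30


for name, count in PARAM_COUNTS.items():
    print(f"{name}: about {weight_memory_gib(count):.1f} GiB of weights")
```

    Under these assumptions the weights alone come to roughly 2.4, 11.9, and 24.2 GiB. Actual serving requirements are higher and depend on batch size, sequence length, and the inference stack.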

    The First Domestic LLM to Enter Amazon Web Services Marketplace

    VARCO LLM marks a significant milestone as the first domestic (Korean) LLM to be made available through Amazon SageMaker JumpStart on Amazon Web Services (AWS). Customers can purchase the infrastructure needed to run the model and use it on AWS. SageMaker JumpStart is part of Amazon SageMaker, AWS's fully managed machine learning service, which provides an integrated environment for building, training, and deploying machine learning models. While VARCO LLM is not open source, the collaboration with AWS makes it available globally, enabling users worldwide to engage in a monthly free trial anytime and from any location. Notably, NC maintains an advantageous stance concerning customer data security, as it has exclusive control over its own AWS infrastructure model.

    NCSOFT’s page on Amazon SageMaker JumpStart

    NC is gearing up for strategic collaborations with multiple companies, aiming to yield high-value results by harnessing the potential of NC's language model. The spotlight is particularly on vertical sectors, each characterized by distinct attributes and requisites. This uniqueness renders swift and precise application of generalized language models challenging. To address this, NC is working on specialized AI solutions tailored to the specifics of each domain, aligned with market demands.

    In this vein, NC's language model is poised to evolve into a versatile tool for external companies, helping them expand globally. Relative to models of comparable size, VARCO LLM has undergone refined training that strengthens its conversation and generation abilities. This makes it possible to integrate AI quickly across various domains without significant cost. Furthermore, because NC manages every stage itself, including data collection, pre-training, and fine-tuning, it can predict more accurately how the model will behave in diverse scenarios. For instance, if an issue arises when the model is used, NC can review all of the training data or retrain the model to bring it under control. This oversight will also be useful when the model is integrated with diverse services in the future.

    Bilingual Models and Ethics Engines

    NC's language model is a bilingual system designed to handle both Korean and English language processing. This unique model offers advantages like cost savings compared to using separate models for each language and swift application across various fields. By training on high-quality Korean data from collaborations with domestic universities and existing English data, NC's model has achieved proficiency in both languages. This proficiency allows for smooth language interactions, translations, and multilingual services.

    Pre-training Data

    NC places a strong emphasis on AI ethics when it comes to its language model. The company has adopted stringent criteria, particularly in data collection. Even if content was publicly available on the internet, NC exercised careful discernment, exclusively selecting websites verified through multiple channels for training. Additionally, NC developed a dedicated AI ethics engine to rigorously filter data. This engine not only removes personal information or biased content but also conducts sentence-level checks to exclude impolite expressions, promotional material, and more, going beyond the realm of ethical concerns.
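
    To illustrate what sentence-level filtering of this kind can look like, here is a minimal sketch. It is not NC's ethics engine, whose implementation this article does not describe. The regular expressions, the placeholder term list, and every function name below are hypothetical. The sketch also simplifies by dropping any sentence that contains personal information instead of redacting only the sensitive part.

```python
import re

# HYPOTHETICAL patterns and terms, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b\d{2,4}-\d{3,4}-\d{4}\b")
BLOCKED_TERMS = {"idiot", "buy now", "limited offer"}  # placeholder list


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def keep_sentence(sentence):
    """Return False if the sentence contains personal data or a blocked term."""
    if EMAIL_RE.search(sentence) or PHONE_RE.search(sentence):
        return False
    lowered = sentence.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def filter_document(text):
    """Keep only the sentences that pass every check."""
    return " ".join(s for s in split_sentences(text) if keep_sentence(s))


sample = (
    "The weather is clear. Contact me at a@b.com today. "
    "Buy now and save! Flights resume at noon."
)
print(filter_document(sample))
```

    A production filter would more plausibly rely on trained classifiers than on fixed word lists. The sketch only shows the sentence-by-sentence structure of the check.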

    Using AI Creatively

    VARCO is a specialized model designed to generate high-quality content for game development. Its capabilities could substantially boost efficiency across different aspects of game development, including planning, operations, and art. VARCO's training data consists primarily of in-game text and scenarios. To build this data, NC translated copyrighted books from various countries in-house and assembled a range of persona-driven dialogue data. Having learned from immersive, intricate, and coherent dialogue, VARCO is well suited to producing engaging game content.

    “VARCO Studio” is an AI platform service built on VARCO LLM. It comprises ▲ an image generation tool (VARCO Art), ▲ a text generation and management tool (VARCO Text), and ▲ a digital human creation, editing, and operation tool (VARCO Human).

    “VARCO Art” is a web-based image generation AI tool specialized for NC’s IPs. Having passed internal testing in July, it is now being refined for practical use in game development. “VARCO Text” is a versatile tool for text creation and management, built on the VARCO language model. It simplifies the generation and organization of crucial elements such as scenarios, world-building, and character profiles. While its initial scope targets game-related content, future plans include extending it to a wider range of documents, including general communication such as emails. “VARCO Human” is an integrated tool for managing the creation, editing, and operation of digital humans in one place. It is set to incorporate expert personas, enriched by specialized knowledge learned through the VARCO language model. NC's roadmap involves the official launch of all three generative AI services for in-house developers this year.

    NC's language model has broad potential beyond game content, in sectors such as automotive platforms, education, finance, and biotech. In a significant step last July, NC joined forces with the startup DRIMAES to build a “Vehicle AI News Solution.” This collaboration between DRIMAES, Yonhap News, and NC aims to use AI to deliver personalized news to drivers in real time, so that drivers can follow news matching their interests while staying attentive to the road. Additionally, NC has begun a partnership with the Aviation Meteorological Office (AMO) to apply generative AI to the writing of aviation weather information. Using observed and forecast data for each airport, NC's language model generates precise, easily understandable sentences that people can grasp quickly.

    Employing the distinctive prowess of NC's language model across all phases facilitates the creation of tailor-made solutions, contributing to the advancement of cutting-edge technologies within the gaming industry. Moreover, NC's language model promises a unique level of ingenuity in various creative domains, setting it apart from conventional, generalized creative AI. The revolution catalyzed by NC's language model is only just beginning.