Blog

Viking 13B: Scaling Nordic AI models using an open source training framework for LUMI

Viking 13B

Together with the University of Turku’s research group TurkuNLP and the HPLT project, Europe’s largest private AI lab Silo AI is releasing Viking 13B, a new, larger multilingual model natively trained in all Nordic languages. The release showcases the ability to scale LLM training across thousands of GPUs with a new customized open source training framework for LUMI, and provides further evidence of the team’s novel approach to training LLMs for low-resource languages.

Following the completion of the language models Poro 34B and Viking 7B, Silo AI and TurkuNLP of the University of Turku are now releasing the full 13-billion-parameter version of Viking.

In addition to the Nordic languages, the Viking model family also covers English and programming languages. Focusing on low-resource languages without compromising English performance, Viking 13B covers Danish, Finnish, Icelandic, Norwegian and Swedish. The model family features an updated, more modern architecture and comes in a variety of model sizes, of which this is the second.

TurkuNLP and Silo AI's collaboration focuses on developing models that excel in linguistic performance and inclusivity while respecting local values and cultures. This effort aims to democratize access to LLMs, thus enhancing Europe's digital infrastructure and innovation ecosystem. By doing so, the initiative seeks to accelerate the adoption of LLM-driven products and applications.

Viking is trained on LUMI, the most powerful supercomputer in Europe. Silo AI and TurkuNLP have demonstrated training on AMD hardware at scale, with scaling experiments validating theoretical throughput predictions on up to 4096 MI250X GPUs simultaneously. With the new Viking family, they are training models using a new customized open source training framework, utilizing up to 1024 MI250X GPUs at once and proving the ability to train LLMs at scale.

Viking 13B: A modern architecture with more languages

Below is a summary of key features of the Viking model family covering English, Finnish, Swedish, Norwegian, Danish, Icelandic and code. For transparency with respect to model architecture, data and other technical information, please refer to the official model card (Viking 7B, Viking 13B, Viking 33B).

  • Research checkpoints: Silo AI and TurkuNLP are committed to publishing checkpoints throughout training, providing transparency into how the model develops.
  • Model architecture: Viking 13B uses an architecture similar to Llama 2, with flash attention, rotary embeddings and grouped query attention, and supports a 4k sequence length.
  • Model sizes: 13B parameters
  • Multilingual capabilities: The model is designed to process English and the Nordic languages, and has proficiency in a variety of programming languages. It can also perform basic translation between English and the Nordic languages.
  • Dataset: The model is trained on a dataset of 2 trillion tokens, with Danish, English, Finnish, Icelandic, Norwegian, Swedish and a variety of programming languages represented.
  • Open source: The model is freely available under the Apache 2.0 License, making it suitable for both commercial and research use.
  • Training hardware: The model is trained using up to 1024 AMD MI250X GPUs on the LUMI supercomputer in Finland.
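As a rough illustration of how the released weights might be used, the sketch below loads a Viking checkpoint through the Hugging Face `transformers` library. The repository id `LumiOpen/Viking-13B`, the use of the `revision` argument to select a research checkpoint, and the generation settings are assumptions on my part; verify them against the official model card before use.

```python
# Sketch: loading a Viking checkpoint with Hugging Face transformers.
# The repository id is an assumption -- check the official model card.

def build_prompt(text: str) -> str:
    """Viking is a base (non-instruct) model, so plain continuation prompts work best."""
    return text.strip() + "\n"

def generate(prompt: str, model_id: str = "LumiOpen/Viking-13B") -> str:
    # Imports kept local so the sketch can be read without a GPU environment set up.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # halves memory relative to fp32
        device_map="auto",           # spread layers across available devices
        # revision="...",            # research checkpoints may be selectable by revision
    )
    inputs = tokenizer(build_prompt(prompt), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Since Viking 13B is a base model rather than an instruction-tuned one, it is best prompted with text to continue rather than with questions or commands.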

Considerations for Use

The intended audience for Viking research checkpoints is academic and industry research. These checkpoints are not suitable for deployment in a production use case without further training, fine-tuning and testing.

Acknowledgments

We wish to thank the operators of the LUMI/EuroHPC supercomputer for computational resources and technical support, including AMD, HPE and CSC – the IT Center for Science, Finland. TurkuNLP researchers have received funding from the European Union’s Horizon Europe research and innovation programme High Performance Language Technologies (HPLT) under grant agreement No 101070350.

About

TurkuNLP

The TurkuNLP Group is a group of researchers at the University of Turku, with a research focus on various aspects of natural language processing, language technology and digital linguistics. TurkuNLP has contributed to a large number of open source NLP resources, such as FinBERT, WikiBERT, FinGPT, the Turku Dependency Treebank, Universal Dependencies, the Turku Neural Parsing Pipeline, large internet corpora, the Turku Paraphrase Corpus, the Turku Sentiment Corpus, Wikidata normalization and TurkuONE.

University of Turku

The University of Turku is an inspiring and international academic community of 25,000 students and staff in Southwest Finland. We build a sustainable future with multidisciplinary research, education, and collaboration. The University of Turku was ranked among the 301–400 best universities in the 2023 Academic Ranking of World Universities, or the so-called Shanghai Ranking. Among Finnish universities, the University of Turku ranked in a shared position of 2nd-3rd.
www.utu.fi

Silo AI

Silo AI is Europe’s largest private AI lab on a mission to ensure Europe has a flagship AI company. We’re a trusted AI partner that brings competitive advantage to product R&D. We build AI-driven solutions and products to enable smart devices, autonomous vehicles, industry 4.0, and smart cities. Silo AI provides its customers unique access to world-class AI models and expertise, as well as the SiloGen model-as-a-service platform. As part of SiloGen, Silo AI is currently building market-leading open source LLMs to ensure European digital sovereignty and democratize access to LLMs.
www.silo.ai

Want to discuss how Silo AI could help your organization?

Get in touch with our AI experts.
Peter Sarlin, PhD
Co-founder


