Silo AI releases the first checkpoints of Viking, an open LLM for all Nordic languages, English and programming languages.

Viking 7B/13B/33B: Sailing the Nordic seas of multilinguality

Last updated: 4.4.2024

Europe’s largest private AI lab, Silo AI, is committed to strengthening European digital sovereignty and democratizing access to European large language models (LLMs). Together with the University of Turku’s research group TurkuNLP and HPLT, Silo AI is releasing the first checkpoints of Viking, an open LLM for all Nordic languages, English and programming languages. Viking provides models that are not only linguistically performant and inclusive, but also sensitive to local values and cultures. Evaluations indicate best-in-class performance in all Nordic languages, without compromising performance in English. Viking is an enhanced successor to the earlier model Poro, and offers further evidence that this novel approach to training LLMs for low-resource languages works. The aim is to train state-of-the-art LLMs for all official EU languages.

Following the completion of the language model Poro, Silo AI’s generative AI arm SiloGen and the TurkuNLP research group at the University of Turku are now releasing the first model checkpoints for Viking, a family of models for the Nordic languages. Viking relies on the same training approach as Poro, focusing on low-resource languages without compromising English, but extends it to cover Danish, English, Finnish, Icelandic, Norwegian, Swedish and programming languages. The model family also comes with an updated, more modern architecture and in a variety of model sizes.

This effort is part of Silo AI's broader strategy to empower linguistic diversity across the continent and further enhance LLM capabilities in low-resource languages. By utilizing the latest advancements in multilingual LLMs, Silo AI and TurkuNLP are working to create models that are not only linguistically performant and inclusive, but also sensitive to local values and cultures, ensuring that the technology serves as a bridge rather than a barrier in digital communication.

The completion of Poro and release of Viking function as the first steps in SiloGen’s efforts to train state-of-the-art LLMs for all official EU languages. This indicates that the initiative tailored for a wider European audience is on track and resulting in performant models. It builds Europe’s digital infrastructure, facilitating the widespread adoption of LLM-based products and applications and allowing for innovation in a broad range of sectors and use cases across Europe.

Viking checkpoint performance: 50% of training and 1000B tokens

After the first ten Viking checkpoints, covering 50% of training and 1000B tokens, we can observe preliminary evidence that Viking outperforms other open models (e.g. Falcon, GPT-SW3, Llama, Mistral, MPT and StarCoder). Results indicate best-in-class performance in low-resource languages compared with other open models, without compromising performance in English and programming languages. In our latest evaluations, Viking is evaluated on a large number of relevant measures, including translated tests such as MMLU, ARC-C and HellaSwag. While translated tests are commonly used (e.g. to demonstrate the multilinguality of Mistral Large) and provide indicative evidence, they don't fully capture the multilingual reasoning capabilities of language models. Another measure, perplexity, further corroborates Viking’s performance: lower perplexity means the model assigns higher probability to held-out text in a given language. Overall, Viking shows that it is adept at understanding and generating Nordic languages and efficient at predicting linguistic sequences. This dual advantage indicates the viability of the approach to training multilingual models, and Viking's technological edge in navigating the complexities of multilinguality.
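To make the perplexity measure concrete, here is a minimal sketch of how it is usually computed from per-token log-probabilities. The function name and the example numbers are illustrative assumptions, not values or code from the Viking evaluations.

```python
import math

def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Hypothetical per-token probabilities for the same sentence under two models.
confident_model = [math.log(0.5), math.log(0.25), math.log(0.5)]
uncertain_model = [math.log(0.1), math.log(0.05), math.log(0.1)]

# The model that assigns higher probability to the text gets lower perplexity.
print(perplexity(confident_model) < perplexity(uncertain_model))
```

A model that assigned probability 0.5 to every token would have a perplexity of 2, which is why perplexity is often read as the effective number of equally likely choices per token.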

Viking 7B, 13B and 33B: A modern architecture with more languages

Below is a summary of key features of the Viking model family, which covers English, Finnish, Swedish, Norwegian, Danish, Icelandic and code. For transparency with respect to model architecture, data and other technical information, please refer to the official model cards (Viking 7B, Viking 13B, Viking 33B).

Research Checkpoints: While the first release includes five checkpoints for the models, further checkpoints will be released throughout the training process, providing transparency on the model training process.
Model architecture: Viking uses an architecture similar to Llama 2, with flash attention, rotary embeddings and grouped query attention, and supports a 4k sequence length.
Model sizes: 7B, 13B and 33B parameters
Multilingual capabilities: The models are designed to process English and Nordic languages, and have proficiency with a variety of programming languages. Additionally, they can perform basic translation between English and Nordic languages.
Dataset: The model family is trained on a dataset of 2 trillion tokens, covering Danish, English, Finnish, Icelandic, Norwegian, Swedish and a variety of programming languages.
Open source: The model family is freely available under the Apache 2.0 License, which permits both commercial and research use.
Training hardware: Our models are trained using up to 1024 AMD MI250X GPUs on the LUMI supercomputer in Finland.
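Of the architecture features listed above, rotary embeddings are the easiest to show in a few lines. The sketch below is a generic, pure-Python illustration of the idea under common assumptions (adjacent-pair rotation, base 10000). It is not Viking's actual implementation, and the function and variable names are hypothetical.

```python
import math

def apply_rotary(vec, position, base=10000.0):
    """Rotate each adjacent pair of dimensions by a position-dependent angle."""
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # The rotation frequency decreases as the pair index grows.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x0, x1 = vec[i], vec[i + 1]
        out.extend([x0 * cos_t - x1 * sin_t, x0 * sin_t + x1 * cos_t])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors for a single attention head.
query = [0.3, -1.2, 0.7, 0.5]
key = [1.0, 0.4, -0.6, 0.9]

# Both pairs of positions are two tokens apart, so the scores should agree.
score_a = dot(apply_rotary(query, 5), apply_rotary(key, 3))
score_b = dot(apply_rotary(query, 2), apply_rotary(key, 0))
print(abs(score_a - score_b) < 1e-9)
```

The point of the design is that the attention score between a rotated query and a rotated key depends only on how far apart the two tokens are, not on their absolute positions.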

Considerations for Use

The intended audience for Viking Research Checkpoints is academic and industry research. These checkpoints are not suitable for deployment in a production use case without further training, fine-tuning and testing. For more on Silo AI's SaaS-based custom LLMs, we invite you to familiarize yourself with the SiloGen platform.

Acknowledgments

We wish to thank the operators of the LUMI/EuroHPC supercomputer for computational resources and technical support, including AMD, HPE and CSC – the IT Center for Science, Finland. TurkuNLP researchers have received funding from the European Union’s Horizon Europe research and innovation programme High Performance Language Technologies (HPLT) under grant agreement No 101070350.

About

Silo AI

Silo AI is a leading AI lab on a joint mission with AMD to shape the future of AI computing. We’re a trusted AI partner that brings competitive advantage to leadership AI solutions. We build AI to enable smart devices, autonomous vehicles, industry 4.0, and smart cities. Silo AI trains state-of-the-art open source AI models, and offers customers unique access to world-class AI capabilities and the SiloGen platform. With advanced compute, a full-stack AI platform and world-leading AI scientists, our approach empowers organizations to develop AI that they own and control.

www.silo.ai

Want to discuss how Silo AI could help your organization?

Get in touch with our AI experts.

Peter Sarlin, PhD

Co-founder

peter.sarlin@silo.ai
