Submitted by Stefan Schweter
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
CORAL NLP Research