Monthly update 2024-11-01 is online!

Free, large game collection in a database for Scid

Why was a new database created?

Some time ago, I searched for freely available PGN files or databases that are still being maintained, but was unsuccessful. Either such projects could no longer be found or they had not been updated for years.

So I decided to create my own database with Scid vs. PC/MAC. I started from several existing databases and PGN files. I marked the games from each source with a SOURCE tag, so you can search for them using Scid vs. PC/MAC or Scid 5.0.

Tag | Description
Britbase | All games from the British Chess Game Archive.
ChessNostalgia (*) | Nothing can be found on this site any more.
ChessOK (*) | chessok.com still offers PGNs free of charge (until the end of 2020).
Chessopolis (*) | PGNs may still be offered, but then behind a paywall.
Convekta (*) | Possibly a publisher of chess literature. Only dealers selling Convekta products can be found.
Danbase | Database of the Danish chess federation, https://danbase.skak.dk/. Thanks @ Hans Jørgen Lassen.
LichessBroadcast | Games relayed through the Lichess broadcast system.
LichessEliteDatabase | Standard Lichess games, filtered to keep only games by players rated 2400+ against players rated 2200+, excluding bullet games. Source: https://database.nikonoel.fr/. Of these, all classical, rapid and blitz games in which both players have a Lichess rating above 2550 are added.
LumbrasGigaBase | All games from existing databases whose origin cannot be clarified have been given this tag.
PGNMentor | Extensive archive with individual files for players, openings, opening variations and various tournaments.
TWIC | All games from the TWIC download.

(*) Found in the GitHub project

You can search for the SOURCE tag in Scid via the menu Search –> General –> Extra tags.

The data preparation process

After merging the databases, a number of measures were taken to compress the database and eliminate duplicates:

  • All games with fewer than 10 half-moves have been deleted.
  • All player names, tournament locations, rounds etc. were corrected using Scid’s maintenance function, as far as Scid was able to do so.
  • All games in which both players have an Elo rating below 1800 have been deleted.
  • ECO codes have been added to all games.
  • All remaining games were checked for duplicates. The following parameters had to match for a game to be declared a duplicate:
    • The same first 4 letters of the player names.
    • The same player colors.
    • The same moves.
    • The same result.
  • The files were processed with the program pgn-extract. Unnecessary tags were removed and some were renamed so that the information is available in standard tags (primarily the date).
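The filtering and duplicate rules listed above can be sketched in a few lines of Python. This is an illustration of the stated criteria, not the actual tooling used (that was Scid and pgn-extract). The Game record and function names are hypothetical, and two details are assumptions: name prefixes are compared case-insensitively, and a missing rating does not count as "below 1800".

```python
from dataclasses import dataclass
from typing import Optional

MIN_HALF_MOVES = 10   # games with fewer half-moves are dropped
MIN_ELO = 1800        # games are dropped if BOTH players are rated below this


@dataclass(frozen=True)
class Game:
    white: str
    black: str
    white_elo: Optional[int]   # None when no rating is recorded
    black_elo: Optional[int]
    moves: tuple[str, ...]     # one entry per half-move
    result: str


def is_low_rated(elo: Optional[int]) -> bool:
    # Assumption: an unknown rating is not treated as below the limit.
    return elo is not None and elo < MIN_ELO


def keep(game: Game) -> bool:
    """Apply the length and rating filters."""
    if len(game.moves) < MIN_HALF_MOVES:
        return False
    if is_low_rated(game.white_elo) and is_low_rated(game.black_elo):
        return False
    return True


def duplicate_key(game: Game) -> tuple:
    """Two games are duplicates if name prefixes, colors, moves and result match."""
    # Assumption: the 4-letter prefixes are compared case-insensitively.
    return (game.white[:4].lower(), game.black[:4].lower(), game.moves, game.result)


def clean(games: list[Game]) -> list[Game]:
    """Filter the games and keep only the first game of each duplicate group."""
    seen: set[tuple] = set()
    kept: list[Game] = []
    for game in games:
        if not keep(game):
            continue
        key = duplicate_key(game)
        if key in seen:
            continue
        seen.add(key)
        kept.append(game)
    return kept
```

Because White and Black occupy fixed positions in the key, the same moves with the colors swapped are not treated as a duplicate, which matches the "same player colors" rule.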

Content of the database

An example of the players contained in the database can be found here.

  • More than 13,200,000 games
  • More than 590,000 players
  • More than 36,000 events
  • More than 26,000 locations

Under German law, chess games without annotations cannot be copyrighted, but the annotations themselves can. Therefore, all annotations and variations were removed from the entire database. The German Chess Federation clarified this question in a short article in 2006. The expert opinion mentioned in the article is available online and can be downloaded:

DSB expert opinion on the question “Is there a copyright on chess games” (PDF in German)

Future updates

As the weekly updates have not really been very popular so far, the database will now only be updated once a month – usually on the first Tuesday:

  • Database files (si5 and si4 formats)
  • A differential PGN file containing the games added since the release of the previous database.

How can you support me?

I love coffee! You are welcome to buy me a coffee!

The initial creation of the database, the cleansing, research for sources etc. was quite time-consuming. However, this has now been completed, so further maintenance is no longer a major problem.

But of course I also pay for this website – so if anyone likes the database and wants to help keep this site going, please feel free to support me on Buy Me A Coffee.

10 responses to “Free, large game collection in a database for Scid”

  1. Hans Jørgen Lassen

    You can find another 100,000 games here: https://danbase.skak.dk/

    1. Hi Hans,

      I will take a look at it. Thanks ;)

      Regards, Michael

    2. Most of the games were already in the database. However, I was able to add almost 30,000 games.

      Regards, Michael

  2. Tuttobenny

    Hi, do you have a strategy to clear up the “Events” field? Problem: Scid does have a limit on # of events, but in the Lichess database, every match is a unique event (the games URL). So, import of the really big pgns of the Lichess database is not possible even in SCID5.

    Solution: Regex search/replace. But how? I tried many tools, but it all fails. E.g., last I tried was fnr – it succeeds until Lichess 2014 files, but then fails on my machine, memory out of range exception, something like this. Not easy!

    Do you have an idea? It’s Windows here.

    What I do is create polyglot books with weak players.

    1. I’ve sent you an email. You might want to take a look ;)
      Regards, Michael

  3. Thank you so much for all your hard work. I have been using SCID for many years, but now have a great database to go with it. I’m going to share this with all my students. Now, to work out how to buy you a coffee :-)

    1. Thank you very much for your kind words and your support!
      Regards, Michael

  4. Gaiil

    If this page were a chess move, it would be a brilliant move.

    As my favorite Gm would say.
    Sanx for ze games. – Daniil Dubov

  5. This is a fantastic resource! Thank you so much for making this collection.
    I’m in the process of writing a book on romantic style chess openings, and your database has been very helpful in finding games from some of the old masters.

    1. Thank you very much for the praise. I am pleased that the database has helped you with your project.
      Regards, Michael
