Completely new OTB release on July 1st

On July 1st, I'll be releasing a new OTB database and its accompanying PGN files. This version is almost completely free of duplicate games. You can learn more about this process in the linked article.

Free chess game database for Scid

Current version:

Databases: 2025-06-03 (includes the following incremental update)

Opening Books Time Ranges: 2025-02-06
Opening Books ECO: 2025-02-06
Opening Books NIC-Codes: 2025-07-13

Incremental update: none so far

How can you support me?

I love coffee! You are welcome to buy me a coffee!

The initial creation of the database, the cleansing, research for sources etc. was quite time-consuming. However, this has now been completed, so further maintenance is no longer a major problem.

But of course I also pay for this website – so if anyone likes the database and wants to help keep this site going, please feel free to support me on Buy Me A Coffee.

But now let’s get to the important things ;)

About the database

This database was initially created with and for Scid vs. PC/MAC and Scid 5.0 and has now been split into two parts.

  1. OTB database with about 9.5 million games
  2. Online database with about 7.1 million games

Since all Scid versions unfortunately seem to overlook a lot of duplicate games, I have written a script that deduplicates the games of a larger PGN file in several phases. Here you can take a closer look at how the script works. Since the process takes a good 10 hours for 10 million games, I will only run this deduplication on the OTB database, and not for every release. I think running it at the beginning of each year is sufficient for now.

The database can be downloaded as PGN as well as in the respective database formats of Scid vs. PC/MAC and Scid 5.0.

The data comes from various sources, including several existing databases and PGN files. In order to be able to trace the origin of the individual games, all entries were provided with a SOURCE tag. These tags make it possible to filter and search the database with Scid vs. PC/MAC or Scid 5.0 according to specific sources.

| No. | Tag | Description | Database |
|-----|-----|-------------|----------|
| 1 | AjedrezCorr | Correspondence chess database of https://ajedrezdata.com/databases/ | OTB |
| 2 | AjedrezOTB | OTB database of https://ajedrezdata.com/databases/ | both |
| 3 | Britbase | For all games from the British Chess Game Archive | both |
| 4 | Caissabase | A database that no longer seems to exist. | both |
| 5 | Canada | Canadian chess federation | both |
| 6 | ChessNostalgia (*) | Nothing more to be found on this page. | OTB |
| 7 | ChessOK (*) | chessok.com still offers PGNs free of charge (until the end of 2020). | both |
| 8 | Chessopolis (*) | PGNs may still be offered, but then behind a paywall. | OTB |
| 9 | ChessScotland | Scottish chess federation | OTB |
| 10 | Convekta (*) | A Russian software developer that sells software like Chessbase. Mainly used in Russia. | OTB |
| 11 | Danbase | Database of the Danish chess federation https://danbase.skak.dk/ (Thanks @Hans Jørgen Lassen) | OTB |
| 12 | Federscacchi | Italian chess federation | OTB |
| 13 | FICGS | Free Internet Chess & Go Server (correspondence games) | OTB |
| 14 | Finland | Finnish chess federation | both |
| 15 | GamesOfGMs | A database containing only OTB games by grandmasters | OTB |
| 16 | GreekBase | Greek chess federation | both |
| 17 | IECG | International Email Chess Group | OTB |
| 18 | Kingbase | A database containing only OTB games, mostly by grandmasters | OTB |
| 19 | LichessBroadcast | For the games that are relayed through the Lichess broadcast system | both |
| 20 | LichessEliteDatabase | All (standard) Lichess games, filtered to keep only games of players rated 2400+ against players rated 2200+, excluding bullet games. Source: https://database.nikonoel.fr/ In addition, all classical, rapid and blitz games in which both players are rated over 2550 Lichess ELO are added. | Online |
| 21 | LumbrasGigaBase | All games from existing databases whose origin cannot be clarified have been given this tag. | both |
| 22 | Masters | A database containing games from title holders | both |
| 23 | Millionbase | Another old database | both |
| 24 | PGNMentor | Extensive archive with individual files for players, openings, opening variations and various tournaments. | both |
| 25 | Slovakia | Slovakian chess federation | OTB |
| 26 | TWIC | For all games from the TWIC download. | both |

(*) Found in the Github project

You can search for the SOURCE tag in Scid via the menu Search –> General –> Extra tags.
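Outside of Scid, the same tag can also be read directly from the PGN files. The following is only a minimal sketch, not part of my tooling: it assumes the tag is written as a normal PGN tag pair such as `[Source "TWIC"]` (the exact spelling and capitalisation of the tag name is an assumption, so the match is case-insensitive), and the function name `count_sources` and the sample games are made up for illustration.

```python
import re
from collections import Counter

# Matches a PGN tag pair such as [Source "TWIC"].
# Assumption: the tag is named "Source"; the match ignores upper/lower case.
SOURCE_TAG = re.compile(r'^\[Source\s+"([^"]*)"\]$', re.IGNORECASE)


def count_sources(lines):
    """Count games per source value from an iterable of PGN lines."""
    counts = Counter()
    for line in lines:
        match = SOURCE_TAG.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Two invented games, only to show the expected tag layout.
sample = '''[Event "Example Open"]
[White "Example, Alice"]
[Black "Sample, Bob"]
[Result "1-0"]
[Source "TWIC"]

1. e4 e5 1-0

[Event "Example Open"]
[White "Sample, Bob"]
[Black "Example, Alice"]
[Result "1/2-1/2"]
[Source "Britbase"]

1. d4 d5 1/2-1/2
'''

print(count_sources(sample.splitlines()))
```

For a real file, an open file handle can be passed instead of `sample.splitlines()`, so that the file is read line by line and does not have to fit into memory.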

The data preparation process

After merging the databases, a number of measures were taken to compress the database and eliminate duplicates:

  • All games with fewer than 10 half-moves have been deleted.
  • All player names, tournament locations, rounds etc. were corrected using Scid’s maintenance function, as far as Scid was able to do so.
  • All games in which both players have an ELO rating below 1800 have been deleted.
  • ECO codes have been added to all games.
  • All remaining games were checked for duplicates. The following parameters had to match for a game to be declared a duplicate:
    • The same first 4 letters of the player names.
    • The same player colors.
    • The same moves.
    • The same result.
  • The files were processed with the pgn-extract program. Unnecessary tags were removed and some were renamed so that the information is available in standard tags (mainly date).
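The filter and duplicate criteria from the list above can be sketched roughly as follows. This is an illustration only, not the procedure Scid uses internally: the `Game` class and all function names are hypothetical, the player names in the example are invented, comparing the name prefixes case-insensitively is an assumption, and so is keeping games in which a rating is missing (otherwise many historical games would be lost).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Game:
    white: str
    black: str
    white_elo: Optional[int]  # None when no rating is recorded
    black_elo: Optional[int]
    moves: Tuple[str, ...]    # half-moves in SAN, e.g. ("e4", "e5", "Nf3")
    result: str


def keep_game(game, min_half_moves=10, min_elo=1800):
    """Apply the length and rating filters described in the list."""
    if len(game.moves) < min_half_moves:
        return False
    # Assumption: a game is only dropped when BOTH ratings are known and
    # both are below the threshold; games without ratings are kept.
    both_rated = game.white_elo is not None and game.black_elo is not None
    if both_rated and game.white_elo < min_elo and game.black_elo < min_elo:
        return False
    return True


def duplicate_key(game):
    """Build the comparison key: name prefixes by color, moves and result."""
    return (
        game.white[:4].lower(),  # first 4 letters of the White player
        game.black[:4].lower(),  # first 4 letters of the Black player
        game.moves,
        game.result,
    )


def deduplicate(games):
    """Keep the first game for every key and drop filtered games."""
    seen = set()
    unique = []
    for game in games:
        if not keep_game(game):
            continue
        key = duplicate_key(game)
        if key in seen:
            continue
        seen.add(key)
        unique.append(game)
    return unique


opening = ("e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6", "O-O", "Be7")
games = [
    Game("Example, Alice", "Sample, Bob", 2100, 2050, opening, "1-0"),
    Game("Example,A", "Sample,B", 2100, 2050, opening, "1-0"),  # duplicate
    Game("Sample, Bob", "Example, Alice", 2050, 2100, opening, "1-0"),  # colors swapped
    Game("Example, Alice", "Sample, Bob", 2100, 2050, opening[:4], "1-0"),  # too short
    Game("Novice, Carol", "Novice, Dave", 1500, 1600, opening, "0-1"),  # both below 1800
    Game("Oldtimer, Emil", "Oldtimer, Franz", None, None, opening, "1/2-1/2"),  # unrated
]
unique = deduplicate(games)
print(len(unique))
```

Because the colors are part of the key, the same moves with swapped players are not treated as a duplicate.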

After this process in Scid, a script I wrote searches the PGN files for duplicates that Scid did not find and removes them.

Separation of the databases

In order to separate the contents of the databases into online and offline games, the following search terms were used (both for the tournament and the location):

  • chess.com
  • lichess.org
  • chess24.com
  • online
  • internet
  • titled arena
  • titled tue

If you have any other suggestions for search terms, please let me know by email or as a comment.
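As a rough sketch, the separation by search terms could look like this. The term list is taken from the list above; everything else is an assumption for illustration: the function name `is_online` is hypothetical, "tournament" and "location" are taken to mean the PGN Event and Site tags, and the comparison is a case-insensitive substring match.

```python
# Search terms from the list above, all in lower case.
ONLINE_TERMS = (
    "chess.com",
    "lichess.org",
    "chess24.com",
    "online",
    "internet",
    "titled arena",
    "titled tue",
)


def is_online(event, site):
    """Return True if any search term occurs in the event or the site."""
    fields = (event.lower(), site.lower())
    return any(term in field for field in fields for term in ONLINE_TERMS)


# Invented examples, only to show the behaviour.
print(is_online("Titled Tuesday Blitz", "chess.com INT"))
print(is_online("Example Cup", "Lichess.org"))
print(is_online("Example Open", "Berlin GER"))
```

A plain substring match is simple, but it also catches the term anywhere inside a longer word, so the results are worth spot-checking before games are moved from one database to the other.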

Content of the database

An example of the players contained in the database can be found here.

  • OTB game database with around 9.7 million games
  • Online game database with around 7.1 million games
  • More than 700,000 players (OTB and online)

Copyright

Under German law, chess games without any annotations cannot be protected by copyright, but the annotations themselves can. Therefore, the entire database was cleared of all annotations and variations. The German Chess Federation clarified this question in a short article in 2006. The expert opinion mentioned in the article is available online and can be downloaded:

DSB expert opinion on the question “Is there a copyright on chess games” (PDF in German)

Future updates

The database is usually updated on the first Tuesday of each month. The database files in si4 and si5 format as well as an update file in PGN format, which contains the new games since the last update, are made available. If you have already downloaded a database version, it is usually sufficient to import the new games in PGN format into the database. This saves you having to download the complete chess database again.

28 responses to “Free chess game database for Scid”

  1. CN

    Hi, there used to be a monthly database updates in PGN format, but I couldn’t find it anymore. Could you point me to where I can download it? Thanks.

    1. I recently released a new version of the database with a lot of new sources. Because there were so many changes, I decided to remove all cumulative files and export all PGNs once again. The next cumulative file will most likely be released in August, as I’m currently working on a sophisticated deduplication script to remove as many duplicates as possible. Sadly, the Scid versions aren’t doing a very good job of finding duplicates.

      The next full release is planned, as usual, for the 1st of July and will be – again – a full release, due to the removed duplicate games.

  2. Aishik Chattopadhyay

    Why the link is not opening?

    1. Which one do you mean? I have several links on the website. Please be more specific.

  3. Congratulations on your great work!
    I have noticed that the elo cleaning under 1800 hasn’t been applied to the greekbase db

    1. Thanks for the hint, I will check ;)

      Sadly, it seems that this has happened to all the newly added sources. I will remove those games as soon as I have managed to remove another set of duplicates.

      Right now, if you want to remove them yourself, try searching for ELO ranges from 1 to 1799 for each player. Setting the lower value to 0 is not recommended, as that would also remove a lot of (historical) games that have no ELO value set.

  4. Richard Thrasher

    I am using your database of chess games to construct a database of chess positions. I have written the code to read your files and play the games therein. I still have code to write to sort the positions and save them into files. The code I have written is still being debugged. Nevertheless, I have found a game that contains an impossible move. I am working with the 2025-01-07 version of your files, with no updates. In the file of undated games, game three (Radeva vs Wagner) starts with the moves

    1.d4 Nf6 2.Nf3 e6 3.g3 b5 4.Bg2 Bb7 5.a4 b4 6.c4 bxc3 7.Nxc3 Bb4 8.O-O a5
    9.Bg5 d6 10.Qc2 Nbd7 11.Nb5 Rc8 12.Na7 Rb8 13.Nc6 Bxc6 14.Qxc6 Rb6 15.Qc2

    The queen move Qxc6 at move 14 would have the Q at c2 (from move 10) capture the B at c6 (from move 13). But there is a white pawn at c4 (from move 6) that is blocking this move.

    I fully expect that with such a large database there will be some mistaken moves, though I am surprised to have found an error in the third game. I hope soon to have played all the games. I will certainly let you know what I find.

    1. Hi Richard,

      I looked at the game and your moves and found no mistakes. The pawn you mentioned was taken en passant (6.c4 bxc3). I would guess that your code is not quite correct in this case.

      Regards,
      Michael

      1. Richard Thrasher

        Yes, Thankyou.

        1. Richard Thrasher

          I want to say how much I appreciate your help with debugging my code. I did indeed have a problem in my EP detection routine. Now I have come upon another problem, an ambiguous move. In game 7 in the same file, at move 7 Black plays Ne7 (line 143). But both black knights can reach e7. The variant Nce7 runs into trouble when the white bishop tries to capture the black knight at c6 in move 15, while the variant Nge7 successfully runs to conclusion. I suspect ambiguous moves might be quite common. For now I will just keep a list of those I find, but it might be worthwhile to try to resolve the ambiguity by running all the possibilities and seeing which actually work.

          1. Richard Thrasher

            I have once again missed the point, there is no ambiguity because the c N is pinned on the king. I will have to add checks and pins to my code as well. LOL

  5. Kalonji Collins

    Excellent job!!
    I combed through the ‘no date’ database, and found dates for the games. For the London Blitz games in that file, I had to match by name. So I know that every player played in that tournament, but I didn’t have time to confirm all the games.

    Webmaster, send me an email if you want me to share this with you.

  6. Chessmaster2780

    Best database ever. It would be really nice to release a lite version with only the important games above 2200 elo.

    1. Many thanks for the praise. Feel free to email me with some suggestions of what you’d like to see in the lite version. I can’t promise that I’ll make the effort to implement this, as it’s just a “side project” of mine!

      Regards, Michael

  7. Woprandi

    I’m a EnCroissant (http://www.encroissant.org) contributor and just discovered your amazing work. I could regularly convert the updated database to EC format to be available to download.

    1. Hi William,

      You are welcome to convert the database, but please also refer to my website, as described in the license ;)

      Regards, Michael

    2. Ryan

      Is the Lumbra Gigabase now included with En Croissant?

      1. I have no idea whether the developers have included my database so far. But if they haven’t, they should probably wait until the next release, as some structural changes are most likely coming.

        As of right now, I plan to split the database in two, one containing OTB games, the other one containing Blitz and rapid, mostly online.

        This is due to space problems with the SI4 database, as this type of DB can “only” contain a maximum of 16,777,216 games.

  8. Hans Jørgen Lassen

    You can find another 100,000 games here: https://danbase.skak.dk/

    1. Hi Hans,

      I will take a look at it. Thanks ;)

      Regards, Michael

    2. Most of the games were already in the database. However, I was able to add almost 30,000 games.

      Regards, Michael

  9. Tuttobenny

    Hi, do you have a strategy to clear up the “Events” field? Problem: Scid does have a limit on # of events, but in the Lichess database, every match is a unique event (the games URL). So, import of the really big pgns of the Lichess database is not possible even in SCID5.

    Solution: Regex search/replace. But how? I tried many tools, but it all fails. E.g., last I tried was fnr – it succeeds until Lichess 2014 files, but then fails on my machine, memory out of range exception, something like this. Not easy!

    Do you have an idea? It’s Windows here.

    What I do is create polyglot books with weak players.

    1. I’ve sent you an email. You might want to take a look ;)
      Regards, Michael

  10. Thank you so much for all your hard work. I have been using SCID for many years, but now have a great database to go with it. I’m going to share this with all my students. Now, to work out how to buy you a coffee :-)

    1. Thank you very much for your kind words and your support!
      Regards, Michael

  11. Gaiil

    If this page were a chess move, it would be a brilliant move.

    As my favorite Gm would say.
    Sanx for ze games. – Daniil Dubov

  12. This is a fantastic resource! Thank you so much for making this collection.
    I’m in the process of writing a book on romantic style chess openings, and your database has been very helpful in finding games from some of the old masters.

    1. Thank you very much for the praise. I am pleased that the database has helped you with your project.
      Regards, Michael
