Data Sources

This page credits third-party data sources and models used by Xfire word tools.

These credits do not make Xfire a tournament word-list source. The word-validity corpora used by our word tools are independent general-English and game-helper corpora, not NWL, NASPA, TWL, or other Scrabble tournament dictionaries.

gwordlist / Google Books frequency data

Xfire uses gwordlist, a Google Books Ngram-derived frequency list, as a ranking overlay for word-reference tools such as the Anagram Solver, Word Finder, Playable Word Checker, and Crossword Pattern Matcher.

The source data is credited under the Creative Commons Attribution 3.0 Unported license.

Xfire uses, modifies, and transforms that source data into lowercase word-score data for ranking results drawn from its existing word corpora. The resulting scores are used only for ordering and scoring suggestions, not as a validity dictionary or tournament word list.
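
As an illustration only, the split between validity and ranking could look like the sketch below. The function names, the sample counts, and the choice to sum case variants into one lowercase key are all assumptions made for this example; the page does not state how Xfire actually performs the transformation.

```python
from typing import Dict, Iterable, List, Set


def build_scores(raw_counts: Dict[str, int]) -> Dict[str, int]:
    """Fold mixed-case frequency entries into lowercase keys.

    Summing the counts of case variants is an assumption for this sketch.
    """
    scores: Dict[str, int] = {}
    for word, count in raw_counts.items():
        key = word.lower()
        scores[key] = scores.get(key, 0) + count
    return scores


def rank(candidates: Iterable[str], corpus: Set[str], scores: Dict[str, int]) -> List[str]:
    """Keep only words found in the validity corpus, then order by score.

    The frequency data never admits a word; it only sorts words that the
    corpus already accepts. Ties are broken alphabetically.
    """
    valid = [w for w in candidates if w in corpus]
    return sorted(valid, key=lambda w: (-scores.get(w, 0), w))


# Hypothetical sample data, not real gwordlist values.
scores = build_scores({"The": 5, "the": 10, "tea": 3, "eta": 1, "zzz": 100})
corpus = {"ate", "eta", "tea"}
ranked = rank(["ate", "eta", "tea", "zzz"], corpus, scores)
```

In this sketch "zzz" has the highest frequency score but is still excluded, because it is absent from the validity corpus.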

Mini-BART G2P and LibriSpeech Alignments

Xfire uses Mini-BART G2P to generate fallback ARPABET pronunciations for Rhyme Finder words that are not covered by CMUdict.
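
A minimal sketch of that dictionary-first lookup follows. The `pronounce` function, the source labels, and the `g2p` callable are hypothetical stand-ins; the sketch does not reproduce the Mini-BART G2P interface, and the stub below simply returns a fixed pronunciation.

```python
from typing import Callable, Dict, List, Tuple

Pron = List[str]  # ARPABET phonemes, e.g. ["K", "AE1", "T"]


def pronounce(
    word: str,
    cmudict: Dict[str, Pron],
    g2p: Callable[[str], Pron],
) -> Tuple[Pron, str]:
    """Return (pronunciation, source) for a word.

    The dictionary entry wins when one exists; the model is consulted
    only for words the dictionary does not cover.
    """
    key = word.lower()
    if key in cmudict:
        return cmudict[key], "dictionary"
    return g2p(key), "generated"


# Hypothetical data: a one-entry dictionary and a stub in place of the model.
sample_dict = {"cat": ["K", "AE1", "T"]}


def stub_g2p(word: str) -> Pron:
    return ["B", "L", "AE1", "T"]


known = pronounce("Cat", sample_dict, stub_g2p)
unknown = pronounce("blat", sample_dict, stub_g2p)
```

Returning the source alongside the pronunciation lets later steps keep generated results distinct from dictionary-confirmed ones.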

Mini-BART G2P is released under the Apache 2.0 license. Its model card reports training on CMUdict and on LibriSpeech Alignments; LibriSpeech Alignments is credited under the Creative Commons Attribution 4.0 International license.

Xfire uses, modifies, and transforms those generated pronunciations into rhyme groups, and labels generated-pronunciation matches separately from dictionary-confirmed CMUdict matches.
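
One way to picture the grouping is sketched below. It assumes a common rhyme definition (the phonemes from the last primary-stressed vowel to the end of the word) and hypothetical names and sample pronunciations; the page does not specify the rule Xfire applies.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pron = List[str]
Entry = Tuple[Pron, str]  # (pronunciation, "dictionary" or "generated")


def rhyme_key(pron: Pron) -> Tuple[str, ...]:
    """Phonemes from the last primary-stressed vowel onward.

    ARPABET vowels as written in CMUdict end in a stress digit (0, 1, 2).
    If no vowel carries primary stress, fall back to the last vowel of
    any stress, then to the whole pronunciation.
    """
    for i in range(len(pron) - 1, -1, -1):
        if pron[i].endswith("1"):
            return tuple(pron[i:])
    for i in range(len(pron) - 1, -1, -1):
        if pron[i][-1].isdigit():
            return tuple(pron[i:])
    return tuple(pron)


def group_rhymes(entries: Dict[str, Entry]) -> Dict[Tuple[str, ...], List[Tuple[str, str]]]:
    """Group words by rhyme key, keeping each word's source label."""
    groups: Dict[Tuple[str, ...], List[Tuple[str, str]]] = defaultdict(list)
    for word, (pron, source) in entries.items():
        groups[rhyme_key(pron)].append((word, source))
    return dict(groups)


# Hypothetical entries; "blat" stands in for a model-generated pronunciation.
groups = group_rhymes({
    "cat": (["K", "AE1", "T"], "dictionary"),
    "hat": (["HH", "AE1", "T"], "dictionary"),
    "blat": (["B", "L", "AE1", "T"], "generated"),
    "dog": (["D", "AO1", "G"], "dictionary"),
})
```

Because every member of a group carries its source label, a results list can show generated-pronunciation matches apart from dictionary-confirmed ones.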