dotnet add package SymSpell --version 6.7.3
NuGet\Install-Package SymSpell -Version 6.7.3
<PackageReference Include="SymSpell" Version="6.7.3" />
Directory.Packages.props: <PackageVersion Include="SymSpell" Version="6.7.3" />
Project file: <PackageReference Include="SymSpell" />
paket add SymSpell --version 6.7.3
#r "nuget: SymSpell, 6.7.3"
#:package SymSpell@6.7.3
#addin nuget:?package=SymSpell&version=6.7.3 (install as a Cake Addin)
#tool nuget:?package=SymSpell&version=6.7.3 (install as a Cake Tool)
Spelling correction & Fuzzy search: 1 million times faster through Symmetric Delete spelling correction algorithm
The Symmetric Delete spelling correction algorithm reduces the complexity of edit candidate generation and dictionary lookup for a given Damerau-Levenshtein distance. It is six orders of magnitude faster (than the standard approach with deletes + transposes + replaces + inserts) and language independent.
Unlike other algorithms, only deletes are required; no transposes, replaces, or inserts. Transposes, replaces, and inserts of the input term are transformed into deletes of the dictionary term. Replaces and inserts are expensive and language dependent: e.g. Chinese has 70,000 Unicode Han characters!
The speed comes from the inexpensive delete-only edit candidate generation and the pre-calculation.<br> An average 5 letter word has about 3 million possible spelling errors within a maximum edit distance of 3,<br> but SymSpell needs to generate only 25 deletes to cover them all, both at pre-calculation and at lookup time. Magic!
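To make the arithmetic above concrete, the sketch below enumerates the deletes of a single word and checks whether two terms share a delete. It is a minimal illustration under stated assumptions, not the library's implementation: the class `DeleteSketch` and its methods `Deletes` and `IsCandidate` are hypothetical names, and a shared delete only marks a candidate whose true edit distance would still have to be verified.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the SymSpell API.
public static class DeleteSketch
{
    // Returns every distinct string obtained by deleting 1..maxDistance characters from word.
    public static HashSet<string> Deletes(string word, int maxDistance)
    {
        var result = new HashSet<string>();
        var frontier = new HashSet<string> { word };
        for (int depth = 0; depth < maxDistance; depth++)
        {
            var next = new HashSet<string>();
            foreach (string s in frontier)
            {
                if (s.Length <= 1) continue; // simplification: never delete down to the empty string
                for (int i = 0; i < s.Length; i++)
                {
                    string shorter = s.Remove(i, 1);
                    if (result.Add(shorter)) next.Add(shorter);
                }
            }
            frontier = next;
        }
        return result;
    }

    // True if the two terms (or their deletes) meet, i.e. dictionaryTerm is a
    // correction candidate for input. The real edit distance is not computed here.
    public static bool IsCandidate(string input, string dictionaryTerm, int maxDistance)
    {
        var inputSide = Deletes(input, maxDistance);
        inputSide.Add(input);
        var dictionarySide = Deletes(dictionaryTerm, maxDistance);
        dictionarySide.Add(dictionaryTerm);
        return inputSide.Overlaps(dictionarySide);
    }

    public static void Main()
    {
        Console.WriteLine(Deletes("house", 3).Count);         // 25
        Console.WriteLine(IsCandidate("huose", "house", 1));  // transposition: both sides contain "hose"
        Console.WriteLine(IsCandidate("hause", "house", 1));  // replace: both sides contain "huse"
        Console.WriteLine(IsCandidate("hoouse", "house", 1)); // insert: a delete of the input is "house"
    }
}
```

For a five-letter word with distinct letters the count is 5 + 10 + 10 = 25 (deleting one, two, or three characters), which matches the figure quoted above.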
If you like SymSpell, try SeekStorm - a sub-millisecond full-text search library & multi-tenancy server in Rust (Open Source).
<br>
Copyright (c) 2025 Wolf Garbe
Version: 6.7.3
Author: Wolf Garbe <wolf.garbe@seekstorm.com>
Maintainer: Wolf Garbe <wolf.garbe@seekstorm.com>
URL: https://github.com/wolfgarbe/symspell
Description: https://seekstorm.com/blog/1000x-spelling-correction/
MIT License
Copyright (c) 2025 Wolf Garbe
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
https://opensource.org/licenses/MIT
Lookup provides a very fast spelling correction of single words.
0.033 milliseconds/word (edit distance 2) and 0.180 milliseconds/word (edit distance 3) (single core on 2012 Macbook Pro)<br>
1,870 times faster than BK-tree (see Benchmark 1: dictionary size=500,000, maximum edit distance=3, query terms with random edit distance = 0...maximum edit distance, verbose=0)<br> 1 million times faster than Norvig's algorithm (see Benchmark 2: dictionary size=29,157, maximum edit distance=3, query terms with fixed edit distance = maximum edit distance, verbose=0)<br>
1000x Faster Spelling Correction algorithm<br> Fast approximate string matching with large edit distances in Big Data<br> Very fast Data cleaning of product names, company names & street names<br> Sub-millisecond compound aware automatic spelling correction<br> SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking<br> Fast Word Segmentation for noisy text<br> The Pruning Radix Trie — a Radix trie on steroids<br>
LookupCompound supports compound aware automatic spelling correction of multi-word input strings.
1. Compound splitting & decompounding
Lookup() treats every input string as a single term. LookupCompound also supports compound splitting / decompounding with three cases:
Splitting errors, concatenation errors, substitution errors, transposition errors, deletion errors and insertion errors can be mixed within the same word.
2. Automatic spelling correction
Examples:
- whereis th elove hehad dated forImuch of thepast who couqdn'tread in sixthgrade and ins pired him
+ where is the love he had dated for much of the past who couldn't read in sixth grade and inspired him (9 edits)
- in te dhird qarter oflast jear he hadlearned ofca sekretplan
+ in the third quarter of last year he had learned of a secret plan (9 edits)
- the bigjest playrs in te strogsommer film slatew ith plety of funn
+ the biggest players in the strong summer film slate with plenty of fun (9 edits)
- Can yu readthis messa ge despite thehorible sppelingmsitakes
+ can you read this message despite the horrible spelling mistakes (9 edits)
0.2 milliseconds / word (edit distance 2) 5000 words / second (single core on 2012 Macbook Pro)
WordSegmentation divides a string into words by inserting missing spaces at appropriate positions.<br>
Examples:
- thequickbrownfoxjumpsoverthelazydog
+ the quick brown fox jumps over the lazy dog
- itwasabrightcolddayinaprilandtheclockswerestrikingthirteen
+ it was a bright cold day in april and the clocks were striking thirteen
- itwasthebestoftimesitwastheworstoftimesitwastheageofwisdomitwastheageoffoolishness
+ it was the best of times it was the worst of times it was the age of wisdom it was the age of foolishness
Applications:
Performance:
4 milliseconds for segmenting a 185-character string into 53 words (single core on 2012 Macbook Pro) <br>
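The idea of inserting missing spaces can be sketched with a small dynamic program over a frequency dictionary. This is a simplified stand-in, not SymSpell's WordSegmentation: it performs no spelling correction, only accepts exact dictionary words, and the names `SegmentationSketch`, `Segment` and `maxWordLength` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the SymSpell API.
public static class SegmentationSketch
{
    // Splits text into dictionary words so that the sum of log word probabilities is maximal.
    // Returns the input unchanged when no complete split into dictionary words exists.
    public static string Segment(string text, IDictionary<string, long> counts, int maxWordLength)
    {
        long total = 0;
        foreach (long c in counts.Values) total += c;

        int n = text.Length;
        var best = new double[n + 1]; // best[i]: best score for the prefix of length i
        var prev = new int[n + 1];    // prev[i]: start index of the last word in that best split
        for (int i = 1; i <= n; i++)
        {
            best[i] = double.NegativeInfinity;
            for (int j = Math.Max(0, i - maxWordLength); j < i; j++)
            {
                if (double.IsNegativeInfinity(best[j])) continue;
                string word = text.Substring(j, i - j);
                if (!counts.TryGetValue(word, out long count)) continue;
                double score = best[j] + Math.Log((double)count / total);
                if (score > best[i]) { best[i] = score; prev[i] = j; }
            }
        }
        if (double.IsNegativeInfinity(best[n])) return text;

        // Walk the back-pointers from the end of the string to recover the words.
        var words = new List<string>();
        for (int i = n; i > 0; i = prev[i]) words.Add(text.Substring(prev[i], i - prev[i]));
        words.Reverse();
        return string.Join(" ", words);
    }

    public static void Main()
    {
        var counts = new Dictionary<string, long>
        {
            ["the"] = 100, ["quick"] = 10, ["brown"] = 10, ["fox"] = 10
        };
        Console.WriteLine(Segment("thequickbrownfox", counts, 10)); // the quick brown fox
    }
}
```

Bounding the candidate word length keeps the inner loop short, so the running time grows linearly with the length of the input string.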
single word + Enter: Display spelling suggestions<br> Enter without input: Terminate the program
multiple words + Enter: Display spelling suggestions<br> Enter without input: Terminate the program
string without spaces + Enter: Display word segmented text<br> Enter without input: Terminate the program
The Demo, DemoCompound and SegmentationDemo projects can be built with the free Visual Studio Code, which runs on Windows, macOS and Linux.
//create object
int initialCapacity = 82765;
int maxEditDistanceDictionary = 2; //maximum edit distance per dictionary precalculation
var symSpell = new SymSpell(initialCapacity, maxEditDistanceDictionary);
//load dictionary
string baseDirectory = AppDomain.CurrentDomain.BaseDirectory;
string dictionaryPath= baseDirectory + "../../../../SymSpell/frequency_dictionary_en_82_765.txt";
int termIndex = 0; //column of the term in the dictionary text file
int countIndex = 1; //column of the term frequency in the dictionary text file
if (!symSpell.LoadDictionary(dictionaryPath, termIndex, countIndex))
{
Console.WriteLine("File not found!");
//press any key to exit program
Console.ReadKey();
return;
}
//lookup suggestions for single-word input strings
string inputTerm="house";
int maxEditDistanceLookup = 1; //max edit distance per lookup (maxEditDistanceLookup<=maxEditDistanceDictionary)
var suggestionVerbosity = SymSpell.Verbosity.Closest; //Top, Closest, All
var suggestions = symSpell.Lookup(inputTerm, suggestionVerbosity, maxEditDistanceLookup);
//display suggestions, edit distance and term frequency
foreach (var suggestion in suggestions)
{
Console.WriteLine(suggestion.term +" "+ suggestion.distance.ToString() +" "+ suggestion.count.ToString("N0"));
}
//load bigram dictionary
dictionaryPath = baseDirectory + "../../../../SymSpell/frequency_bigramdictionary_en_243_342.txt"; //reuses the variables declared above
termIndex = 0; //column of the term in the dictionary text file
countIndex = 2; //column of the term frequency in the dictionary text file
if (!symSpell.LoadBigramDictionary(dictionaryPath, termIndex, countIndex))
{
Console.WriteLine("File not found!");
//press any key to exit program
Console.ReadKey();
return;
}
//lookup suggestions for multi-word input strings (supports compound splitting & merging)
inputTerm="whereis th elove hehad dated forImuch of thepast who couqdn'tread in sixtgrade and ins pired him";
maxEditDistanceLookup = 2; //max edit distance per lookup (per single word, not per whole input string)
suggestions = symSpell.LookupCompound(inputTerm, maxEditDistanceLookup);
//display suggestions, edit distance and term frequency
foreach (var suggestion in suggestions)
{
Console.WriteLine(suggestion.term +" "+ suggestion.distance.ToString() +" "+ suggestion.count.ToString("N0"));
}
//word segmentation and correction for multi-word input strings with/without spaces
inputTerm="thequickbrownfoxjumpsoverthelazydog";
var segmentation = symSpell.WordSegmentation(inputTerm);
//display corrected string and summed edit distance
Console.WriteLine(segmentation.correctedString + " " + segmentation.distanceSum.ToString("N0"));
//press any key to exit program
Console.ReadKey();
SymSpell targets .NET Standard v2.0 and can be used in:
The SymSpell, Demo, DemoCompound and Benchmark projects can be built with the free Visual Studio Code, which runs on Windows, macOS and Linux.
Dictionary quality is paramount for correction quality. To achieve this, two data sources were combined by intersection: Google Books Ngram data, which provides representative word frequencies (but contains many entries with spelling errors), and SCOWL (Spell Checker Oriented Word Lists), which ensures genuine English vocabulary (but contains no word frequencies, which are required for ranking suggestions within the same edit distance).
The frequency_dictionary_en_82_765.txt was created by intersecting the two sources described above (Google Books Ngram data and SCOWL). Through reciprocal filtering, only words that appear in both lists are used. Additional filters were applied, and the resulting list was truncated to the ≈ 80,000 most frequent words.
You can build your own frequency dictionary for your language or your specialized technical domain. The SymSpell spelling correction algorithm supports languages with non-Latin characters, e.g. Cyrillic, Chinese or Georgian.
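The intersect-and-truncate step can be sketched as follows. This is an illustration under assumptions rather than the procedure actually used to build the shipped dictionary: the names `DictionaryBuildSketch`, `Intersect` and `ToLines` are hypothetical, the additional filters mentioned above are not modeled, and the output format (term and count separated by a space, one entry per line) is assumed from the term and count columns used in the sample code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of the SymSpell API.
public static class DictionaryBuildSketch
{
    // Keeps only terms present in both sources, then the maxTerms most frequent of them.
    public static List<KeyValuePair<string, long>> Intersect(
        IDictionary<string, long> frequencies, // e.g. counts derived from a corpus
        ISet<string> vocabulary,               // e.g. a curated word list
        int maxTerms)
    {
        return frequencies
            .Where(kv => vocabulary.Contains(kv.Key))
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal) // deterministic order for equal counts
            .Take(maxTerms)
            .ToList();
    }

    // Formats entries as "term count" lines (assumed layout: term in column 0, count in column 1).
    public static IEnumerable<string> ToLines(IEnumerable<KeyValuePair<string, long>> entries)
    {
        return entries.Select(e => e.Key + " " + e.Value);
    }

    public static void Main()
    {
        var frequencies = new Dictionary<string, long>
        {
            ["the"] = 1000, ["teh"] = 40, ["house"] = 300, ["hose"] = 20
        };
        var vocabulary = new HashSet<string> { "the", "house", "hose", "horse" };

        // "teh" is dropped because it is missing from the vocabulary;
        // "horse" is dropped because it has no frequency.
        foreach (string line in ToLines(Intersect(frequencies, vocabulary, 2)))
            Console.WriteLine(line); // prints "the 1000", then "house 300"
    }
}
```

The resulting lines could then be written to a text file and loaded in the same way as the English dictionary in the sample code.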
SymSpell includes an English frequency dictionary
Dictionaries for Chinese, English, French, German, Hebrew, Italian, Russian and Spanish are located here:<br>
Frequency dictionaries in many other languages can be found here:<br> FrequencyWords repository<br> Frequency dictionaries<br> Frequency dictionaries
C# (original source code)<br> https://github.com/wolfgarbe/symspell
.NET (NuGet package)<br> https://www.nuget.org/packages/symspell
I have not tested whether the following third-party ports or reimplementations in other programming languages are exact ports, are error free, provide identical results, or are as fast as the original algorithm.
Most ports target SymSpell version 3.0, but version 6.1 provides much higher speed and lower memory consumption!
WebAssembly<br> https://github.com/justinwilaby/spellchecker-wasm<br>
WEB API (Docker)<br> https://github.com/LeonErath/SymSpellAPI (Version 6.3)<br>
C++<br> https://github.com/AtheS21/SymspellCPP (Version 6.5)<br> https://github.com/erhanbaris/SymSpellPlusPlus (Version 6.1)
Crystal<br> https://github.com/chenkovsky/aha/blob/master/src/aha/sym_spell.cr
Go<br> https://github.com/sajari/fuzzy<br> https://github.com/eskriett/spell
Haskell<br> https://github.com/cbeav/symspell
Java<br> https://github.com/MighTguY/customized-symspell (Version 6.6)<br> https://github.com/rxp90/jsymspell (Version 6.6)<br> https://github.com/Lundez/JavaSymSpell (Version 6.4)<br> https://github.com/gpranav88/symspell<br> https://github.com/searchhub/preDict<br> https://github.com/jpsingarayar/SpellBlaze
Javascript<br> https://github.com/MathieuLoutre/node-symspell (Version 6.6, needs Node.js)<br> https://github.com/itslenny/SymSpell.js<br> https://github.com/dongyuwei/SymSpell<br> https://github.com/IceCreamYou/SymSpell<br> https://github.com/Yomguithereal/mnemonist/blob/master/symspell.js
Julia<br> https://github.com/Arkoniak/SymSpell.jl
Kotlin<br> https://github.com/Wavesonics/SymSpellKt
Objective-C<br> https://github.com/AmitBhavsarIphone/SymSpell (Version 6.3)
Python<br> https://github.com/mammothb/symspellpy (Version 6.7)<br> https://github.com/viig99/SymSpellCppPy (Version 6.5)<br> https://github.com/zoho-labs/symspell (Python bindings of Rust version)<br> https://github.com/ne3x7/pysymspell/ (Version 6.1)<br> https://github.com/Ayyuriss/SymSpell<br> https://github.com/ppgmg/github_public/blob/master/spell/symspell_python.py<br> https://github.com/rcourivaud/symspellcompound<br> https://github.com/Esukhia/sympound-python<br> https://www.kaggle.com/yk1598/symspell-spell-corrector
Ruby<br> https://github.com/PhilT/symspell
Rust<br> https://github.com/reneklacan/symspell (Version 6.6, compiles to WebAssembly)<br> https://github.com/luketpeterson/fuzzy_rocks (persistent datastore backed by RocksDB)
Scala<br> https://github.com/semkath/symspell
Swift<br> https://github.com/gdetari/SymSpellSwift
Contextual Multilingual Spellchecker for User Queries<br> Sanat Sharma, Josep Valls-Vargas, Tracy Holloway King, Francois Guerin, Chirag Arora (Adobe)<br> https://arxiv.org/abs/2305.01082
A context sensitive real-time Spell Checker with language adaptability<br> Prabhakar Gupta (Amazon)<br> https://arxiv.org/abs/1910.11242
SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments<br> Kai-Robin Lange and Carsten Jentsch<br> https://arxiv.org/pdf/2410.17886
An Extended Sequence Tagging Vocabulary for Grammatical Error Correction<br> Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan<br> https://arxiv.org/abs/2302.05913
German Parliamentary Corpus (GERPARCOR)<br> Giuseppe Abrami, Mevlüt Bagci, Leon Hammerla, Alexander Mehler<br> https://arxiv.org/abs/2204.10422
iOCR: Informed Optical Character Recognition for Election Ballot Tallies<br> Kenneth U. Oyibo, Jean D. Louis, Juan E. Gilbert<br> https://arxiv.org/abs/2208.00865
Amazigh spell checker using Damerau-Levenshtein algorithm and N-gram<br> Youness Chaabi, Fadoua Ataa Allah<br> https://www.sciencedirect.com/science/article/pii/S1319157821001828
Survey of Query correction for Thai business-oriented information retrieval<br> Phongsathorn Kittiworapanya, Nuttapong Saelek, Anuruth Lertpiya, Tawunrat Chalothorn<br> https://ieeexplore.ieee.org/document/9376809
SymSpell and LSTM based Spell- Checkers for Tamil<br> Selvakumar Murugan, Tamil Arasan Bakthavatchalam, Malaikannan Sankarasubbu<br> https://www.researchgate.net/publication/349924975_SymSpell_and_LSTM_based_Spell-_Checkers_for_Tamil
SymSpell4Burmese: Symmetric Delete Spelling Correction Algorithm (SymSpell) for Burmese Spelling Checking<br> Ei Phyu Phyu Mon; Ye Kyaw Thu; Than Than Yu; Aye Wai Oo<br> https://ieeexplore.ieee.org/document/9678171
Spell Check Indonesia menggunakan Norvig dan SymSpell<br> Yasir Abdur Rohman<br> https://medium.com/@yasirabd/spell-check-indonesia-menggunakan-norvig-dan-symspell-4fa583d62c24
Analisis Perbandingan Metode Burkhard Keller Tree dan SymSpell dalam Spell Correction Bahasa Indonesia<br> Muhammad Hafizh Ferdiansyah, I Kadek Dwi Nuryana<br> https://ejournal.unesa.ac.id/index.php/jinacs/article/download/50989/41739
Improving Document Retrieval with Spelling Correction for Weak and Fabricated Indonesian-Translated Hadith<br> Muhammad Zaky Ramadhan, Kemas M Lhaksmana<br> https://www.researchgate.net/publication/342390145_Improving_Document_Retrieval_with_Spelling_Correction_for_Weak_and_Fabricated_Indonesian-Translated_Hadith
Symspell을 이용한 한글 맞춤법 교정<br> 김희규<br> https://heegyukim.medium.com/symspell%EC%9D%84-%EC%9D%B4%EC%9A%A9%ED%95%9C-%ED%95%9C%EA%B8%80-%EB%A7%9E%EC%B6%A4%EB%B2%95-%EA%B5%90%EC%A0%95-3def9ca00805
Mending Fractured Texts. A heuristic procedure for correcting OCR data<br> Jens Bjerring-Hansen, Ross Deans Kristensen-McLachlan, Philip Diderichsen and Dorte Haltrup Hansen<br> https://ceur-ws.org/Vol-3232/paper14.pdf
Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems<br> Arthur Flor de Sousa Neto; Byron Leite Dantas Bezerra; and Alejandro Héctor Toselli<br> https://www.mdpi.com/2076-3417/10/21/7711
When to Use OCR Post-correction for Named Entity Recognition?<br> Vinh-Nam Huynh, Ahmed Hamdi, Antoine Doucet<br> https://hal.science/hal-03034484v1/
Automatic error Correction: Evaluating Performance of Spell Checker Tools<br>
ZHAW-CAI: Ensemble Method for Swiss German Speech to Standard German Text<br> Malgorzata Anna Ulasik, Manuela Hurlimann, Bogumila Dubel, Yves Kaufmann,<br> Silas Rudolf, Jan Deriu, Katsiaryna Mlynchyk, Hans-Peter Hutter, and Mark Cieliebak<br> https://ceur-ws.org/Vol-2957/sg_paper3.pdf
Cyrillic Word Error Program Based on Machine Learning<br> Battumur, K., Dulamragchaa, U., Enkhbat, S., Altanhuyag, L., & Tumurbaatar, P.<br> https://mongoliajol.info/index.php/JIMDT/article/view/2661
Fast Approximate String Search for Wikification<br> Szymon Olewniczak, Julian Szymanski<br> https://www.iccs-meeting.org/archive/iccs2021/papers/127440334.pdf
RuMedSpellchecker: Correcting Spelling Errors for Natural Russian Language in Electronic Health Records Using Machine Learning Techniques<br> Dmitrii Pogrebnoi, Anastasia Funkner, Sergey Kovalchuk<br> https://link.springer.com/chapter/10.1007/978-3-031-36024-4_16
An Extended Sequence Tagging Vocabulary for Grammatical Error Correction<br> Stuart Mesham, Christopher Bryant, Marek Rei, Zheng Yuan<br> https://aclanthology.org/2023.findings-eacl.119.pdf
Lightning-fast adaptive immune receptor similarity search by symmetric deletion lookup<br> Touchchai Chotisorayuth, Andreas Tiffeau-Mayer<br> https://arxiv.org/html/2403.09010v1
Unveiling Disguised Toxicity: A Novel Pre-processing Module for Enhanced Content Moderation<br> Johnny Chan, Yuming Li<br> https://www.sciencedirect.com/science/article/pii/S2215016124001225
netstandard2.0;net461;net47;netcoreapp3.0 to netstandard2.0;net9.0.
bycycle → bicycle (instead of by cycle)<br>
inconvient → inconvenient (instead of i convent)<br>
SymSpell is contributed by SeekStorm - the high performance Search as a Service & search API
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | Compatible: net9.0. Computed: net5.0, net5.0-windows, net6.0, net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows |
| .NET Core | Computed: netcoreapp2.0, netcoreapp2.1, netcoreapp2.2, netcoreapp3.0, netcoreapp3.1 |
| .NET Standard | Compatible: netstandard2.0. Computed: netstandard2.1 |
| .NET Framework | Computed: net461, net462, net463, net47, net471, net472, net48, net481 |
| MonoAndroid | Computed: monoandroid |
| MonoMac | Computed: monomac |
| MonoTouch | Computed: monotouch |
| Tizen | Computed: tizen40, tizen60 |
| Xamarin.iOS | Computed: xamarinios |
| Xamarin.Mac | Computed: xamarinmac |
| Xamarin.TVOS | Computed: xamarintvos |
| Xamarin.WatchOS | Computed: xamarinwatchos |
Showing the top 3 NuGet packages that depend on SymSpell:
| Package | Description |
|---|---|
| Biomatch.Domain | Utilities to match person records via demographic data |
| Holbor.Speller | The symmetric spelling correction algorithm in .NET Core. |
| VirtoCommerce.IntentSearch.Data | Package Description |
Showing the top 4 popular GitHub repositories that depend on SymSpell:
| Repository | Description |
|---|---|
| overtools/OWLib | DataTool is a program that lets you extract models, maps, and files from Overwatch. |
| harrynull/WoxDictionary | A Wox dictionary plugin that supports spelling correction and synonym. |
| spacechase0/StardewValleyMods | New home for my stardew valley mod source code |
| dr-marek-jaskula/DomainDrivenDesignUniversity | This project was made for tutorial purpose - to clearly present the domain driven design concept. |
Exception fixed in WordSegmentation