What Is the Byte Order Mark (BOM) and How to Remove It

Hidden characters | 2 min read | Updated 2026-06-23

The byte order mark (BOM) is an invisible character (U+FEFF) at the start of a file that signals its encoding. It can cause broken JSON, stray characters at the top of output, and failed comparisons. To remove it, save the file as UTF-8 without BOM, or run the text through a cleaner that strips U+FEFF.

The byte order mark is a tiny invisible character that causes outsized trouble, especially in code and data files. Here is what it is and how to get rid of it.

What the BOM is

The byte order mark, Unicode U+FEFF, is an invisible character that can appear at the very start of a text file. It was designed for UTF-16 and UTF-32, where it tells the reader which byte order the file uses. UTF-8 has no byte-order ambiguity, so the BOM (stored there as the three bytes EF BB BF) is unnecessary, but many editors and tools add it anyway, and it sits silently at position zero.

You cannot see it, but it is there, before your first visible character.
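To make the invisible character visible, here is a minimal Python sketch. The sample text "hello" is just an illustration; `codecs.BOM_UTF8` is the standard library's constant for the UTF-8 encoding of U+FEFF.

```python
import codecs

BOM = "\ufeff"  # the byte order mark as a single Unicode code point

# In UTF-8 the BOM is encoded as three bytes: EF BB BF.
utf8_bom_bytes = BOM.encode("utf-8")
assert utf8_bom_bytes == b"\xef\xbb\xbf" == codecs.BOM_UTF8

# A file saved "with BOM" carries those bytes before the visible text.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")
text = raw.decode("utf-8")  # the plain "utf-8" codec keeps the BOM

print(repr(text))  # the BOM only shows up in the repr
print(len(text))   # 6: five letters plus the invisible character
```

Printing `text` normally would show only "hello", which is why the length and the repr are the quickest giveaways.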

Why the BOM causes problems
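The three problems named in the summary at the top (broken JSON, stray characters, failed comparisons) can be reproduced in a few lines. This is a rough sketch using Python's standard library; the sample strings and the helper `parses` are made up for illustration, and other languages and parsers may behave differently.

```python
import codecs
import json

BOM = "\ufeff"
clean = '{"name": "value"}'
with_bom = BOM + clean  # what code sees after reading a BOM-prefixed file

# Failed comparison: the two strings render identically but are not equal.
same = with_bom == clean

# Broken JSON: Python's json module refuses a str that starts with the BOM.
def parses(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Stray characters: BOM bytes decoded with the wrong codec become visible junk.
stray = (codecs.BOM_UTF8 + b"id,name").decode("cp1252")  # 'ï»¿id,name'

print(same, parses(clean), parses(with_bom))  # False True False
```

The `stray` value is the familiar "ï»¿" prefix that sometimes appears at the top of CSV exports and web pages.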

How to remove the BOM

Save as UTF-8 without BOM. Most editors let you choose the encoding when saving.

Strip it from the text directly. If you are working with copied text rather than a file, paste it into a cleaner that removes U+FEFF. textscrubr strips the BOM along with other invisible characters and shows you that it found one, so you know it is gone.
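If you would rather strip it in code, the idea fits in a few lines. The function below is a hypothetical helper, not textscrubr's implementation, and it handles only a BOM at the very start of the string. It also reports whether it found one, in the same spirit as the cleaner described above.

```python
BOM = "\ufeff"

def strip_bom(text: str) -> tuple[str, bool]:
    """Return the text without a leading BOM, plus whether one was found."""
    if text.startswith(BOM):
        return text[len(BOM):], True
    return text, False

cleaned, found = strip_bom("\ufeffid,name")
print(repr(cleaned), found)  # 'id,name' True
```

Returning the flag alongside the text makes it easy to log or count how often a BOM actually turns up in your input.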

How to check for a BOM
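One quick check, sketched here on the assumption that the file is meant to be UTF-8, is to read its first three bytes and compare them with EF BB BF. The function name `has_utf8_bom` is made up for this example.

```python
import codecs

def has_utf8_bom(path: str) -> bool:
    """Return True if the file at `path` starts with the bytes EF BB BF."""
    with open(path, "rb") as f:  # binary mode, so nothing is decoded or hidden
        return f.read(3) == codecs.BOM_UTF8
```

Opening the file in binary mode matters: a text-mode read with a BOM-aware codec would silently swallow the very character you are looking for.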

Prevent it from coming back

Set your editor's default encoding to UTF-8 without BOM. For files generated by other tools, add a strip-BOM step to your pipeline so it never reaches code or data that chokes on it.
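A strip-BOM step for a pipeline could look roughly like this. It is a sketch that assumes the generated files are UTF-8 text small enough to read into memory; `rewrite_without_bom` is a hypothetical name. Python's `utf-8-sig` codec drops a leading BOM when reading and does nothing if there is none.

```python
def rewrite_without_bom(path: str) -> None:
    """Re-save a UTF-8 file in place so it no longer starts with a BOM."""
    # "utf-8-sig" skips a leading BOM if present; newline="" keeps line endings as-is.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        text = f.read()
    # The plain "utf-8" codec never writes a BOM.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```

Because the step is harmless on files that have no BOM, it can run unconditionally on every generated file before anything downstream reads it.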

Frequently asked questions

What is a byte order mark (BOM)?

An invisible character (U+FEFF) that can appear at the start of a text file to signal its encoding. In UTF-8 it is unnecessary but often added, where it causes hidden problems.

Why does a BOM break my JSON?

Many JSON parsers expect the first character to begin a JSON value, usually an opening brace or bracket. A BOM sits before it, so the parser fails with a confusing error even though the JSON looks correct.

How do I remove a BOM?

Save the file as UTF-8 without BOM in your editor, or for copied text, run it through a cleaner that strips the U+FEFF character along with other invisible characters.