Encoding Detector
Detect text file character encoding (UTF-8, UTF-16, ASCII, Latin-1).
Upload text file
Drop or select a text file to analyze.
View encoding result
See detected encoding, BOM status, and confidence level.
Preview content
View a preview of the decoded text content.
What Is Encoding Detector?
Encoding Detector analyzes text files to determine their character encoding. It first checks for a Byte Order Mark (BOM), which identifies the encoding directly, then falls back to heuristic byte analysis for files without a BOM. The tool detects UTF-8, UTF-16 (LE/BE), UTF-32 (LE/BE), ASCII, and ISO-8859-1/Windows-1252 encodings. Results include the detected encoding, confidence level, BOM details, an explanation of the analysis, and a preview of the decoded content.
Why Use Our Encoding Detector?
- Detects encoding via BOM and heuristic byte analysis.
- Supports UTF-8, UTF-16, UTF-32, ASCII, and Latin-1/Windows-1252.
- Shows confidence level and detection method details.
- Includes decoded content preview to verify detection accuracy.
Common Use Cases
Character Issues
Diagnose mojibake and character display issues by identifying the correct file encoding.
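As a small illustration of how mojibake arises, the sketch below encodes a string as UTF-8 and then decodes the bytes with the wrong codec. The sample string and variable names are illustrative only and are not taken from the tool.

```python
# Text saved as UTF-8 but opened as Windows-1252 turns each multi-byte
# character into two or three unrelated characters.
original = "caf\u00e9"                 # "café"
raw = original.encode("utf-8")         # b'caf\xc3\xa9'

garbled = raw.decode("cp1252")         # wrong decoder: 0xC3 -> "Ã", 0xA9 -> "©"
print(garbled)

# If no bytes were lost, reversing the wrong step recovers the text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

Seeing pairs such as "Ã©" where an accented letter should be is a common sign that UTF-8 bytes were decoded as Latin-1 or Windows-1252.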
Data Import
Determine file encoding before importing text data to ensure correct character handling.
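To show why the encoding matters at import time, here is a minimal sketch that parses the same CSV bytes with two codecs. The helper `read_csv_bytes` and the sample data are hypothetical; the sketch assumes the encoding has already been identified.

```python
import csv
import io


def read_csv_bytes(data: bytes, encoding: str) -> list:
    """Decode raw bytes with the given codec, then parse them as CSV rows."""
    text = data.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))


# A UTF-8 file that starts with a BOM (EF BB BF).
data = b"\xef\xbb\xbfname,city\r\nZo\xc3\xab,K\xc3\xb6ln\r\n"

# Plain "utf-8" keeps the BOM as an invisible character in the first cell.
rows_plain = read_csv_bytes(data, "utf-8")

# Python's "utf-8-sig" codec strips a leading BOM if one is present.
rows_sig = read_csv_bytes(data, "utf-8-sig")
```

With the plain codec the first header is "\ufeffname" rather than "name", which can silently break column lookups after the import.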
Legacy Files
Identify encoding of legacy text files that may use non-UTF-8 encodings.
Development
Verify encoding of source code files, CSV data, and configuration files.
Technical Guide
The detector uses a multi-stage approach:
1. BOM Detection: Checks the first 4 bytes for known BOM sequences (UTF-8: EF BB BF, UTF-16 LE: FF FE, UTF-16 BE: FE FF, UTF-32 LE: FF FE 00 00, UTF-32 BE: 00 00 FE FF). Because the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM, the 4-byte sequences have to be tested before the 2-byte ones. BOM presence provides high-confidence detection.
2. UTF-16 Heuristic: Analyzes null byte patterns. ASCII characters encoded in 16 bits carry one null byte each, so UTF-16 files have frequent null bytes at odd byte offsets (little-endian) or even byte offsets (big-endian), counting from 0.
3. UTF-8 Validation: Validates multi-byte sequences. Valid UTF-8 has specific patterns: 110xxxxx 10xxxxxx for 2-byte, 1110xxxx 10xxxxxx 10xxxxxx for 3-byte, and 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx for 4-byte sequences.
4. ASCII Detection: If all bytes are in the 0x00-0x7F range, the file is pure ASCII (which is also valid UTF-8).
5. Latin-1 Fallback: If bytes exist in the 0x80-0xFF range but don't form valid UTF-8 sequences, ISO-8859-1/Windows-1252 is likely.
Only the first 8KB of the file is analyzed for performance.
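The stages above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the function name `detect_encoding`, the returned labels, and the 40% / 5% null-byte thresholds are all hypothetical choices made for the example.

```python
import codecs

SAMPLE_SIZE = 8 * 1024  # the guide says only the first 8KB is analyzed

# 4-byte BOMs are listed first: the UTF-32 LE BOM (FF FE 00 00) starts
# with the UTF-16 LE BOM (FF FE), so order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data: bytes) -> tuple:
    """Return (encoding, method); a hypothetical helper for illustration."""
    sample = data[:SAMPLE_SIZE]
    truncated = len(data) > SAMPLE_SIZE

    # Stage 1: BOM detection.
    for bom, name in BOMS:
        if sample.startswith(bom):
            return name, "bom"

    # Stage 2: UTF-16 heuristic. ASCII text in UTF-16 LE has nulls at odd
    # offsets, in UTF-16 BE at even offsets. Thresholds are assumptions.
    pairs = len(sample) // 2
    if pairs:
        even_nulls = sample[0::2].count(0)
        odd_nulls = sample[1::2].count(0)
        if odd_nulls > 0.4 * pairs and even_nulls < 0.05 * pairs:
            return "utf-16-le", "heuristic"
        if even_nulls > 0.4 * pairs and odd_nulls < 0.05 * pairs:
            return "utf-16-be", "heuristic"

    # Stage 3: UTF-8 validation. The incremental decoder tolerates a
    # multi-byte sequence cut off at the 8KB boundary when final=False.
    try:
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=not truncated)
    except UnicodeDecodeError:
        # Stage 5: high bytes that are not valid UTF-8.
        return "cp1252", "fallback"

    # Stage 4: valid UTF-8 with no byte above 0x7F is pure ASCII.
    if all(b < 0x80 for b in sample):
        return "ascii", "heuristic"
    return "utf-8", "heuristic"
```

For example, `detect_encoding("hello world".encode("utf-16-le"))` takes the null-byte branch, while `detect_encoding(b"caf\xe9")` fails UTF-8 validation and falls through to the Windows-1252 fallback.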
Tips & Best Practices
- BOM detection provides the highest confidence: files with a BOM are identified directly from it.
- UTF-8 without a BOM is detected by validating multi-byte sequences.
- ISO-8859-1 and Windows-1252 are detected as a fallback when UTF-8 validation fails.
- The content preview helps verify that the detection is correct. Look for garbled characters.
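The last tip, checking the preview for garbled characters, can be partly automated. The sketch below is one possible heuristic, not something the tool is documented to do: it flags a preview that contains the Unicode replacement character (U+FFFD) or C1 control characters (U+0080 to U+009F), which rarely appear in real text. The function name and the 8192-byte limit are assumptions.

```python
def preview_looks_garbled(data: bytes, encoding: str, limit: int = 8192) -> bool:
    """Decode a preview with a candidate encoding and look for warning signs."""
    # errors="replace" substitutes U+FFFD for every undecodable byte.
    text = data[:limit].decode(encoding, errors="replace")
    for ch in text:
        if ch == "\ufffd":
            return True  # some bytes were invalid in this encoding
        if 0x80 <= ord(ch) <= 0x9F:
            return True  # C1 controls: typical of UTF-8 read as Latin-1
    return False


utf8_bytes = "it\u2019s na\u00efve".encode("utf-8")
```

Here `preview_looks_garbled(utf8_bytes, "utf-8")` is false, while decoding the same bytes as `"latin-1"` is flagged because the curly apostrophe's bytes E2 80 99 produce two C1 control characters. The check does not catch every case: pairs such as "Ã¯" pass it, so a visual look at the preview is still worthwhile.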
Frequently Asked Questions
Q: How accurate is the detection?
Q: What is a BOM?
Q: Can it detect Shift-JIS or GB2312?
Q: How much of the file is analyzed?
Q: What about mixed encoding files?
About Encoding Detector
Encoding Detector is a free online tool from FreeToolkit.ai. All processing happens directly in your browser — your data never leaves your device. No registration required. No ads. Just fast, reliable tools.







