You open a CSV that was perfectly fine yesterday. Now names look like JosÃ©, prices show as â‚¬19.99, and curly quotes have turned into â€œthisâ€. Nothing changed about your data. The file is broken.

It isn't broken. The data is still there, intact — but the file was saved in one character encoding and opened expecting a different one. Every character became the wrong character.

This guide explains what's happening, how to detect it, and how to fix it across every tool you might actually use.

What Character Encoding Errors Look Like

The classic symptoms:

What you see	What it means
`JosÃ©` instead of `José`	File is UTF-8, opened as Windows-1252 or Latin-1
`â€œhelloâ€` instead of `“hello”`	Smart quotes encoded in UTF-8, decoded as Windows-1252
`â‚¬` instead of `€`	Euro sign encoded in UTF-8, decoded as Windows-1252
`?` or `â–ˆ` boxes everywhere	File is Shift-JIS or GB2312, opened with a Latin encoding
Accent-free text (`cafe` instead of `café`)	High-byte characters were stripped during export

The corrupted-text phenomenon has a name: mojibake (文字化け), Japanese for "character transformation." It's always a mismatch between the encoding the software wrote with and the encoding the software reads with.

The underlying data is fine. You don't need to re-export from the source — you just need to tell your tool what encoding to use when reading.

Why This Happens: A 90-Second Encoding Primer

Every character in a text file is stored as a number. The encoding is the lookup table that maps numbers to characters.

ASCII (1963): 128 characters. Works for English only.
Windows-1252 (1985): 256 characters. Extends ASCII with European accented letters. Widely used in Windows software.
ISO-8859-1 / Latin-1 (1987): Similar to Windows-1252, slightly different for characters 128–159.
UTF-8 (1993): Handles every character in every human language. Backwards-compatible with ASCII. The modern standard.

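The problem: the same character is stored as different bytes in different encodings. é (e with an accent) is the single byte 0xE9 in Windows-1252, but the two bytes 0xC3 0xA9 in UTF-8. Read a UTF-8 file as Windows-1252 and those two bytes are decoded one at a time, as Ã and ©, producing Ã©. Go the other way, reading a Windows-1252 file as UTF-8, and the lone 0xE9 is an invalid sequence (it announces a multi-byte character that never arrives), so the decoder shows a replacement character (�) or refuses to open the file.

No data was lost. The bytes on disk are unchanged. You just need to tell your software which encoding to read them with.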
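You can reproduce both directions of the mismatch in a few lines of Python. This is a minimal sketch: the sample string is made up for illustration, and `cp1252` is Python's codec name for Windows-1252.

```python
# Illustrative sample text; any string with non-ASCII characters works.
original = "José costs €5"

utf8_bytes = original.encode("utf-8")
cp1252_bytes = original.encode("cp1252")

# UTF-8 bytes read as Windows-1252: each multi-byte character turns into
# two or three wrong characters (classic mojibake).
garbled = utf8_bytes.decode("cp1252")

# Windows-1252 bytes read as UTF-8: the lone high bytes are invalid, so a
# lenient decoder substitutes U+FFFD for each of them.
replaced = cp1252_bytes.decode("utf-8", errors="replace")

# Reading the same bytes with the encoding they were written in recovers
# the text exactly.
recovered = utf8_bytes.decode("utf-8")

print(garbled)
print(replaced)
print(recovered == original)
```

Nothing in the byte strings changes between the three decodes; only the lookup table does.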


Step 1: Identify the Actual Encoding

Before you can fix the problem, you need to know what encoding the file is actually using.

On Mac: Use the `file` command

file -I yourfile.csv

Typical output:

yourfile.csv: text/plain; charset=utf-8
yourfile.csv: text/plain; charset=iso-8859-1
yourfile.csv: text/plain; charset=unknown-8bit

unknown-8bit means the file has bytes above 127 that the tool can't definitively classify. Try Windows-1252 first — it's by far the most common culprit for European exports and legacy CRM dumps.
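If you'd rather check from a script, the same kind of triage can be sketched in Python. The function name `guess_encoding` is hypothetical, and the logic is a heuristic rather than a detector: it looks for a BOM, then tests whether the bytes are valid UTF-8, and otherwise falls back to the Windows-1252 guess suggested above. Pass it the whole file, since a sample cut off mid-character would fail the UTF-8 check.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Return a best-guess Python codec name for raw CSV bytes (heuristic)."""
    # A BOM is the only explicit signal a plain-text file can carry.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Text in a legacy 8-bit encoding is very unlikely to be valid UTF-8
    # by accident, so a clean strict decode is strong evidence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to Windows-1252, the most common culprit.
        return "cp1252"


print(guess_encoding("José".encode("utf-8")))
print(guess_encoding(b"Jos\xe9"))
```

The fallback is only a guess. A Shift-JIS or GBK file will also land in that branch, so treat the result as a starting point and check the decoded text.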

Open the raw bytes in a text editor

On Mac: open the file in TextEdit, BBEdit, or VS Code. In VS Code, the detected encoding appears in the bottom status bar (e.g., UTF-8). Click it and choose Reopen with Encoding to reinterpret the file with a different encoding without changing the bytes on disk.

Check where the file came from

The source often tells you which encoding to expect:

Source	Likely encoding
Windows Excel (pre-2019)	Windows-1252
Windows Excel 2019+ (Save as CSV UTF-8)	UTF-8 with BOM
Mac Excel	UTF-8
Google Sheets export	UTF-8 without BOM
SAP / legacy ERP	ISO-8859-1 or Windows-1252
Japanese systems	Shift-JIS
Chinese systems	GB2312 / GBK
Modern APIs and databases	UTF-8

How to Fix Encoding Errors in Google Sheets

Google Sheets always exports UTF-8, but it doesn't always import your file correctly — especially if the file came from Windows.

Method 1: Use the Import Wizard

Don't drag the file into Google Sheets and don't open it from Drive without specifying the encoding. Use File → Import instead:

Open Google Sheets and go to File → Import
Upload the CSV
In the import settings, find Character encoding and change it from Automatic to the encoding your file actually uses (e.g., Windows-1252 or ISO-8859-1)
Confirm and import

This is the most reliable fix. Google Sheets will now decode the bytes correctly and all accented characters will appear as intended.

Method 2: Convert the File to UTF-8 First

If you're regularly receiving Windows-1252 files and want Google Sheets to just work without touching the import dialog, convert the file to UTF-8 before uploading.

The quickest way on Mac:

iconv -f windows-1252 -t utf-8 input.csv > output_utf8.csv

Replace windows-1252 with iso-8859-1, shift-jis, or whatever encoding file -I reported.

Now upload output_utf8.csv to Google Sheets — it will import cleanly with no encoding dialog needed.

How to Fix Encoding Errors in Excel (Mac)

Excel on Mac handles encoding better than its Windows counterpart, but still requires explicit guidance for non-UTF-8 files.

Use the Text Import Wizard

Open a blank workbook
Go to Data → Get Data → From Text/CSV
Select your file
In the preview dialog, find the File Origin dropdown and select the correct encoding (e.g., 1252: Western European (Windows))
Click Load

The "File Origin" dropdown is Excel's way of letting you specify encoding. The numbering corresponds to Windows code pages — 1252 is Windows-1252, 65001 is UTF-8.

The UTF-8 BOM Problem (Excel-specific)

Windows Excel has a quirk: when you save a file as "CSV UTF-8 (Comma delimited) (*.csv)", it prepends a BOM (Byte Order Mark) — three invisible bytes at the start of the file (0xEF 0xBB 0xBF).

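A BOM isn't harmful when the file stays in Excel, but it confuses many other tools. Python's csv module, reading a file opened as plain utf-8, returns the first header as \ufeffColumn1 instead of Column1 (it shows up as ï»¿Column1 if the file is decoded as Windows-1252). Opening the file with encoding='utf-8-sig' avoids this. Google Sheets handles a BOM fine. Many Unix tools don't.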

To strip a BOM from a file on Mac:

# Check for BOM
hexdump -C yourfile.csv | head -1

# Strip BOM (outputs to new file)
tail -c +4 yourfile.csv > yourfile-nobom.csv

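Or in place with sed. The BSD sed that ships with macOS may not interpret \x escapes itself, so let the shell expand them with $'...', and limit the substitution to the first line:

sed -i '' $'1s/^\xEF\xBB\xBF//' yourfile.csv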
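In Python you usually don't need to strip the BOM from the file at all, because the `utf-8-sig` codec skips it while decoding. The sketch below shows the difference on an in-memory sample; the helper name `read_header` and the sample bytes are made up for illustration.

```python
import csv
import io


def read_header(data: bytes, encoding: str) -> list[str]:
    """Decode CSV bytes with the given codec and return the first row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text, newline="")))


# Sample bytes as Windows Excel would write them for "CSV UTF-8": BOM first.
sample = b"\xef\xbb\xbfColumn1,Column2\r\nJos\xc3\xa9,1\r\n"

print(read_header(sample, "utf-8"))      # BOM survives as U+FEFF in the header
print(read_header(sample, "utf-8-sig"))  # BOM is consumed by the codec
```

`utf-8-sig` also reads files that have no BOM, so it is a reasonable default when you don't know which flavor of UTF-8 you were sent.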

How to Fix Encoding Errors With Python

Python gives you the most control and is the right tool for batch processing or when you need to handle multiple files.

Detect encoding with chardet

pip install chardet

import chardet

with open('yourfile.csv', 'rb') as f:
    raw = f.read(10000)  # read first 10KB — enough to detect
    result = chardet.detect(raw)
    print(result)
    # {'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}

The confidence value matters: below 0.7, try a few encodings manually.
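Trying encodings manually can be as simple as decoding the same bytes with each candidate and looking at the results. A minimal sketch follows; `try_encodings` is a hypothetical helper and the candidate list is only an example.

```python
def try_encodings(data: bytes, candidates: list[str]) -> dict[str, str]:
    """Return the decoded text for every candidate codec that accepts the bytes."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # This codec rejects the bytes outright, so it is ruled out.
            continue
    return results


# Stand-in for the raw bytes you would read from the file in 'rb' mode.
sample = "Café Zoë".encode("cp1252")

results = try_encodings(sample, ["utf-8", "cp1252", "latin-1"])
for name, text in results.items():
    print(f"{name}: {text!r}")
```

A codec that decodes without error is not necessarily the right one. latin-1 accepts every possible byte, for example, so read the output and pick the candidate whose text looks right.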

Re-read and convert with pandas

import pandas as pd

# Read with the encoding you identified (Windows-1252 in this example)
df = pd.read_csv('yourfile.csv', encoding='windows-1252')

# Save as clean UTF-8 (no BOM)
df.to_csv('output_utf8.csv', index=False, encoding='utf-8')

# Save as UTF-8 with BOM (if destination is Windows Excel)
df.to_csv('output_utf8_bom.csv', index=False, encoding='utf-8-sig')

utf-8-sig is Python's codec name for UTF-8 with BOM (pandas passes it straight through). Use it only when the output will be opened in Windows Excel.

Batch convert a directory

import pandas as pd
from pathlib import Path

source_encoding = 'windows-1252'
input_dir = Path('raw_exports/')
output_dir = Path('converted/')
output_dir.mkdir(exist_ok=True)

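for csv_file in input_dir.glob('*.csv'):
    # dtype=str and keep_default_na=False keep every cell as the text it was,
    # so IDs like 007 and empty cells are not reinterpreted during conversion
    df = pd.read_csv(csv_file, encoding=source_encoding, dtype=str, keep_default_na=False)
    df.to_csv(output_dir / csv_file.name, index=False, encoding='utf-8')
    print(f'Converted: {csv_file.name}')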

This is the right approach if you receive weekly Windows exports from an ERP or CRM and need them clean before loading into Google Sheets or a database.
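If the only goal is to change the encoding, the CSV doesn't have to be parsed at all. The sketch below re-encodes a file as a stream of text chunks using only the standard library, which leaves quoting, line endings, and cell contents untouched. The function name `convert_to_utf8` and the 64 KB chunk size are arbitrary choices for illustration.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file to UTF-8 (no BOM) without parsing its contents."""
    # newline="" disables newline translation on both sides, so \r\n stays \r\n.
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # Read in chunks so large exports never have to fit in memory.
        for chunk in iter(lambda: fin.read(64 * 1024), ""):
            fout.write(chunk)
```

To use it on a folder, call `convert_to_utf8(path, output_dir / path.name)` for each path from `input_dir.glob('*.csv')`. A byte that is invalid in the source encoding raises UnicodeDecodeError, which is usually what you want in a batch job: better a loud failure than a silently damaged file.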


How to Fix Encoding Errors on the Mac Command Line

For quick one-off conversions, iconv and file are fast and reliable.

Convert encoding with iconv

# Windows-1252 → UTF-8
iconv -f cp1252 -t utf-8 input.csv > output.csv

# ISO-8859-1 → UTF-8
iconv -f iso-8859-1 -t utf-8 input.csv > output.csv

# Shift-JIS → UTF-8 (Japanese files)
iconv -f shift_jis -t utf-8 input.csv > output.csv

cp1252 is the iconv code for Windows-1252. Run iconv -l to see all supported encodings.

If iconv encounters a byte it can't convert, it stops with an error and leaves you with an incomplete output file. Add -c to skip unconvertible bytes:

iconv -c -f cp1252 -t utf-8 input.csv > output.csv

This silently drops characters that can't be converted. Usually acceptable for a handful of stray bytes in a large export; not acceptable if accuracy matters on every row.
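Before deciding whether dropping is acceptable, it helps to know how many bytes are affected. Python's decoder error modes give a rough analogue of the same trade-off: `errors="ignore"` drops unmappable bytes much like -c, while `errors="replace"` marks them so they can be counted. This is a sketch for single-byte encodings such as Windows-1252; `decode_report` is a hypothetical helper and the sample bytes are contrived.

```python
def decode_report(data: bytes, encoding: str = "cp1252") -> tuple[str, int]:
    """Decode leniently and report how many bytes had no mapping."""
    text = data.decode(encoding, errors="replace")
    # In a single-byte encoding, each unmappable byte becomes one U+FFFD.
    return text, text.count("\ufffd")


# 0x81 has no character assigned in Python's cp1252 codec.
raw = b"caf\xe9 \x81 latte"

marked, bad_bytes = decode_report(raw)
dropped = raw.decode("cp1252", errors="ignore")

print(repr(marked), bad_bytes)
print(repr(dropped))
```

A count of zero means a strict conversion would have succeeded anyway. A large count usually means the source encoding is wrong, not that the file has stray bytes.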

Common Scenarios and Quick Fixes

Scenario 1: "My CSV from Excel has garbage characters"

Windows Excel saves CSV as Windows-1252 by default when you choose "Comma Separated Values (.csv)". The modern workaround is to save as "CSV UTF-8 (Comma delimited) (*.csv)" instead — that option was added in Excel 2019.

If you can't change the export: iconv -f cp1252 -t utf-8 input.csv > output.csv before importing.

Scenario 2: "Google Sheets exported a CSV and now it doesn't work in another tool"

Google Sheets exports UTF-8 without BOM. Most tools handle this fine. If you're importing into Windows Excel and seeing issues, add a BOM:

python3 -c "import sys; sys.stdout.buffer.write(b'\xef\xbb\xbf'); sys.stdout.buffer.write(open('file.csv','rb').read())" > file_bom.csv

Downloading the file again from Google Sheets via File → Download → CSV won't help here: the export will still be UTF-8 without BOM. Either add the BOM as above or select UTF-8 explicitly in the receiving tool's import dialog.
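If you add BOMs from a script, it's worth making the step safe to run twice. Here is a minimal sketch, assuming the input is supposed to be UTF-8 already; `ensure_utf8_bom` is a hypothetical name.

```python
import codecs


def ensure_utf8_bom(data: bytes) -> bytes:
    """Return UTF-8 bytes with exactly one BOM at the start."""
    if data.startswith(codecs.BOM_UTF8):
        return data  # already has a BOM, so don't add a second one
    # A BOM on non-UTF-8 bytes would mislabel the file, so check first.
    data.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    return codecs.BOM_UTF8 + data


once = ensure_utf8_bom("name\nJosé\n".encode("utf-8"))
twice = ensure_utf8_bom(once)
print(once[:3], once == twice)
```

Read the file in 'rb' mode, pass the bytes through, and write the result in 'wb' mode so nothing else about the file changes.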

Scenario 3: "Only some rows have garbled characters"

This means different rows were encoded differently — common when data was merged from multiple sources. You'll need Python with chardet to detect encoding row-by-row, or open the raw file in a hex editor to see exactly which bytes are misbehaving.
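For the common case where the mix is just UTF-8 and one legacy encoding, you can often skip detection and decode line by line with a fallback. The sketch below assumes the second encoding is Windows-1252; `decode_mixed_lines` is a hypothetical helper, not part of chardet.

```python
def decode_mixed_lines(data: bytes, fallback: str = "cp1252") -> str:
    """Decode each line as UTF-8 if it is valid, otherwise with the fallback."""
    decoded = []
    for line in data.splitlines(keepends=True):
        try:
            decoded.append(line.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8, so assume this line came from the legacy source.
            decoded.append(line.decode(fallback, errors="replace"))
    return "".join(decoded)


# One UTF-8 row and one Windows-1252 row, as a merged export might contain.
mixed = b"name\nJos\xc3\xa9\nRen\xe9\n"
print(decode_mixed_lines(mixed))
```

This works because legacy 8-bit text with accented characters almost never happens to be valid UTF-8. It won't help if rows come from three or more encodings, or if a row was already saved with mojibake baked in.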

Scenario 4: "Accented characters are missing entirely — just replaced with ?"

This happens when a converter tried to transcode and couldn't map the character, then used ? as a fallback. The original bytes are gone. You need to re-export from the source — there's no way to recover replaced data.
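A short illustration of why this case is unrecoverable, using Python's own lossy encode mode as a stand-in for whatever converter produced the file (the sample text is made up):

```python
text = "café, Zoë"

# errors="replace" swaps every unmappable character for a literal "?".
lossy = text.encode("ascii", errors="replace")

print(lossy)
# Both accented letters are now the same byte (0x3F), so nothing in the
# output records which character each "?" used to be.
print(lossy.count(b"?"))
```

Compare this with mojibake, where the original bytes are all still present and only the interpretation is wrong.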

Scenario 5: "The file looks fine on my machine but garbled on someone else's"

Your machine is set to a regional locale that matches the file's encoding. Their machine has a different default. The fix is to explicitly export as UTF-8 so the encoding is specified rather than assumed.

How to Prevent Encoding Problems

The root cause of almost every encoding error is an assumption: the tool assumes UTF-8, the file is Windows-1252. Remove the assumption.

When exporting CSV:

Excel on Windows: always choose "CSV UTF-8 (Comma delimited) (*.csv)", not the plain CSV option
Excel on Mac: defaults to UTF-8, no action needed
Google Sheets: exports UTF-8 by default — fine
Legacy systems: check your export settings; add a BOM if recipients use Windows tools

When importing CSV:

Always specify the encoding explicitly in the import dialog
Don't rely on auto-detection for files from mixed-encoding sources
Validate after import: spot-check rows with accented characters or special symbols
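The spot-check can be partly automated by scanning for character sequences that typically appear when UTF-8 has been decoded as Windows-1252. The marker list and the name `find_suspect_rows` below are assumptions for illustration, and the check is a heuristic: a legitimate Ã (as in SÃO PAULO) will be flagged too.

```python
import csv
import io

# Typical debris from UTF-8 read as Windows-1252, plus the replacement character.
SUSPECT_MARKERS = ("Ã", "Â", "â€", "â‚¬", "\ufffd")


def find_suspect_rows(csv_text: str) -> list[int]:
    """Return 1-based row numbers whose cells contain a suspect marker."""
    suspects = []
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        if any(marker in cell for cell in row for marker in SUSPECT_MARKERS):
            suspects.append(row_number)
    return suspects


sample = "name,price\nJosÃ©,â‚¬5\nJosé,€5\n"
print(find_suspect_rows(sample))
```

An empty result doesn't prove the import was correct, but a non-empty one is a cheap early warning before the data goes anywhere else.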

When sharing CSV files:

If you're not sure, convert to UTF-8 before sending: iconv -f cp1252 -t utf-8 input.csv > input_utf8.csv
Communicate the encoding in the filename if the recipient needs to know: export_utf8.csv

Frequently Asked Questions

Q: What's the difference between UTF-8, UTF-8 with BOM, and UTF-16? A: UTF-8 is the standard web encoding — no BOM. UTF-8 with BOM adds three invisible bytes at the start so Windows tools can identify the encoding automatically. UTF-16 uses 2 bytes per character (or 4 for rare characters) and is used by some Windows applications for Unicode support. For CSV files, UTF-8 without BOM is the right choice for anything leaving Windows; UTF-8 with BOM if the output goes directly into Windows Excel.

Q: Why does Excel on Windows save CSVs in Windows-1252 when UTF-8 exists? A: Legacy compatibility. Windows-1252 has been Excel's default encoding for decades. Microsoft added "CSV UTF-8" as a separate format option in Excel 2019 rather than changing the default to avoid breaking existing workflows. Old habits persist.

Q: Is iconv safe to use on large files? A: Yes — iconv streams the input and produces output incrementally. It doesn't load the whole file into memory. You can run it on multi-GB CSV files without issue.

Q: Can I fix encoding in Google Sheets without re-importing? A: Not really. Once the file is imported with the wrong encoding, the bytes have already been misinterpreted and the decoded text is wrong. You need to re-import with the correct encoding specified. There's no in-sheet encoding setting that reinterprets existing data.

Q: My file reports UTF-8 but still has garbled text — why? A: The file may have a BOM marker confusing the reader, or it may have been "converted" to UTF-8 without actually correcting the underlying bytes. Open the raw bytes with hexdump -C yourfile.csv | head and check whether characters above 0x7F look like properly-formed UTF-8 multi-byte sequences or like raw Latin-1 bytes.
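When the garbling has been saved into a valid UTF-8 file this way, reversing the wrong step often repairs it: encode the text back to the bytes it was decoded from, then decode those bytes as UTF-8. A minimal sketch, assuming the wrong decode was Windows-1252; `repair_mojibake` is a hypothetical helper.

```python
def repair_mojibake(text: str, wrong: str = "cp1252") -> str:
    """Undo one round of 'UTF-8 decoded as `wrong`', or return text unchanged."""
    try:
        return text.encode(wrong).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The round trip failed, so the text was probably not garbled this way.
        return text


print(repair_mojibake("JosÃ©"))
print(repair_mojibake("â‚¬19.99"))
print(repair_mojibake("José"))  # already correct, left alone
```

Sequences involving bytes that Python's cp1252 codec leaves unassigned (a closing curly quote is one) won't round-trip, in which case the function returns its input unchanged.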

Q: Does CSVtoSheets handle encoding automatically? A: Yes — CSVtoSheets detects encoding before passing your CSV to Google Sheets, so accented characters and special symbols arrive correctly. It's one of the common failure points in the manual double-click workflow on Mac, where Finder opens the file with the system locale's default encoding rather than inspecting the file.
