I'm working on a project where I need to dynamically read in content from a flat file. The flat file contains some template information, and I'm also pasting in (using TextEdit) the contents of a file that was generated using Perl database access.
The problem was that when I read it into a string using stringWithContentsOfFile and NSUTF8StringEncoding, it would blow up. I could use NSASCIIStringEncoding, but then some of the characters (e.g. em dashes and single and double quotes) were translated incorrectly. If I brought the file up in TextEdit or Dashcode, everything looked great. Displaying it in vi or on the command line did not.
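For anyone wondering why characters like the em dash are the ones that get mangled, here is a minimal Python sketch (not the Objective-C code from my project). It only shows that an em dash is three bytes in UTF-8, so a decoder that treats each byte as one character cannot give back a single dash. Using "latin-1" as that one-byte decoder is my assumption for illustration; I don't know exactly what NSASCIIStringEncoding does under the hood.

```python
# Illustration only: "latin-1" stands in for any one-byte-per-character
# decoder; it is an assumption, not necessarily what Cocoa does internally.
em_dash = "\u2014"

# In UTF-8 the em dash is a three-byte sequence.
utf8_bytes = em_dash.encode("utf-8")
print(" ".join(f"{b:02x}" for b in utf8_bytes))  # e2 80 94

# A single-byte decoder turns those three bytes into three characters.
misread = utf8_bytes.decode("latin-1")
print(len(em_dash), len(misread))  # 1 3
```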
When I did a
file -I foo.txt
it reported the file type as "unknown", although the Perl-generated file was UTF-8.
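A more direct check than file is to try decoding the raw bytes as UTF-8 and see where it fails. The sketch below is Python, and the helper name first_invalid_utf8_offset is made up for this post. The "legacy" sample assumes the text was saved in Mac OS Roman, which is only a guess at one way a file can end up not being valid UTF-8; nothing above confirms that is what happened to mine.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# The same text saved two ways. The Mac OS Roman variant is a hypothetical
# example of a legacy single-byte encoding, not a confirmed cause.
good = "x\u2014y".encode("utf-8")
legacy = "x\u2014y".encode("mac_roman")

print(first_invalid_utf8_offset(good))    # None
print(first_invalid_utf8_offset(legacy))  # 1
```

To run it against a real file, pass in open("foo.txt", "rb").read() so the bytes are checked before anything has a chance to guess an encoding.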
I traced the problem to the TextEdit "Plain Text File Encoding" preferences. Both "Opening Files" and "Saving Files" were set to UTF-8. This was helpful for reading the data, but somehow, when saving the pasted-in content, it caused the file type to get hosed such that certain tools (e.g. my command window, which is set to UTF-8) could no longer properly read the characters.
Once I set the preferences back to Automatic, the saved file came out as UTF-8, although the special characters no longer display correctly in TextEdit. The file also now loads into an NSString with UTF-8 encoding.
Weird.
That's three hours of my life I'll never get back.