Joe Gurr

End Of Line Characters

"This Unicode technical specification is extremely dull reading ... but it does have many interesting characters."

I don't know enough about character encoding. I'm trying, though! I've read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" twice this week and have yet to fully commit it all to memory.

This week at work I had an issue with end of line characters. Basically I'm working with an enterprise document generation platform that accepts data in whatever form you'd like, transforms it, and produces a nice pretty output.

For this particular job we were sent an xlsx file, but our software is programmed to accept csv files, so I had to convert one file format into the other. Easy enough.

I don't have a Microsoft account on my machine (because I don't need one), so this involved opening the spreadsheet in an unlicensed version of Excel, copying the data into a text editor, and running a program like sed over it to reformat it. Pretty simple.
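For what it's worth, here is a rough Python sketch of that reformatting step. It assumes the text copied out of Excel is tab-separated, which is what I'd expect from a clipboard copy but is an assumption all the same. The function name tsv_to_csv and its line_ending parameter are made up for illustration; this is not the sed command I actually ran.

```python
import csv
import io


def tsv_to_csv(text: str, line_ending: str = "\n") -> str:
    """Convert tab-separated text to CSV, writing an explicit line ending.

    Hypothetical helper: assumes the pasted spreadsheet data is tab-separated.
    """
    # newline="" keeps StringIO from translating line endings on read or write,
    # so the csv module sees (and produces) exactly what we ask for.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator=line_ending)
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    pasted = "name\tcity\r\nJoe\tSydney, NSW\r\n"
    print(repr(tsv_to_csv(pasted)))
```

The useful part is that the line ending is chosen on purpose rather than inherited from whichever tool happened to touch the file last. Fields that contain commas get quoted by the csv module.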

When the data gets complex, I sometimes have to open the xlsx file in Apple's Numbers and export to csv from there.

I hunted a bug for about 2 hours this week simply because the export function from Mac's Numbers and the 'reformat in a text editor' method resulted in different end of line encodings.
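A check like the one below would have saved me most of those two hours. It is a minimal sketch: it reads a file as raw bytes and counts the three common line endings (CRLF, LF, and a lone CR), then offers a way to normalise everything to LF. The function names are my own, and I'm not claiming which tool produced which ending; the point is just to look before assuming.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF also contains one LF and one CR, so subtract those out.
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
    }


def normalize_to_lf(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR endings as LF."""
    # Replace CRLF first so its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    import sys

    # Usage: python eol_check.py export_a.csv export_b.csv
    for path in sys.argv[1:]:
        with open(path, "rb") as f:  # binary mode: no newline translation
            print(path, count_line_endings(f.read()))
```

Running it over both csv files side by side makes a mismatch obvious at a glance. Opening the files in binary mode matters here, because text mode would quietly translate the line endings before the count ever saw them.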

I'm not going to make that mistake again ...

I'm learning a lot about how different tools (Windows and Microsoft ones in particular) serve different purposes from the ones I'm used to. I'm not sure how much I like using these tools, but I'm glad to be exposed to them.