In case anybody else finds it useful, here is the mapping from CP1252 bytes 0x80-0xFF to Unicode code points and their UTF-8 encodings.
CP1252 Dec  CP1252 Hex  Unicode Dec  Unicode Hex  UTF-8 Bytes  CP1252 Char  Unicode Char
----------  ----------  -----------  -----------  -----------  -----------  ------------
128         0x80        8364         0x20ac       e282ac       €            €
129         0x81        65533        0xfffd       efbfbd                    �
130         0x82        8218         0x201a       e2809a       ‚            ‚
131         0x83        402          0x0192       c692         ƒ            ƒ
132         0x84        8222         0x201e       e2809e       „            „
133         0x85        8230         0x2026       e280a6       …            …
134         0x86        8224         0x2020       e280a0       †            †
135         0x87        8225         0x2021       e280a1       ‡            ‡
136         0x88        710          0x02c6       cb86         ˆ            ˆ
137         0x89        8240         0x2030       e280b0       ‰            ‰
138         0x8a        352          0x0160       c5a0         Š            Š
139         0x8b        8249         0x2039       e280b9       ‹            ‹
140         0x8c        338          0x0152       c592         Œ            Œ
141         0x8d        65533        0xfffd       efbfbd                    �
142         0x8e        381          0x017d       c5bd         Ž            Ž
143         0x8f        65533        0xfffd       efbfbd                    �
144         0x90        65533        0xfffd       efbfbd                    �
145         0x91        8216         0x2018       e28098       ‘            ‘
146         0x92        8217         0x2019       e28099       ’            ’
147         0x93        8220         0x201c       e2809c       “            “
148         0x94        8221         0x201d       e2809d       ”            ”
149         0x95        8226         0x2022       e280a2       •            •
150         0x96        8211         0x2013       e28093       –            –
151         0x97        8212         0x2014       e28094       —            —
152         0x98        732          0x02dc       cb9c         ˜            ˜
153         0x99        8482         0x2122       e284a2       ™            ™
154         0x9a        353          0x0161       c5a1         š            š
155         0x9b        8250         0x203a       e280ba       ›            ›
156         0x9c        339          0x0153       c593         œ            œ
157         0x9d        65533        0xfffd       efbfbd                    �
158         0x9e        382          0x017e       c5be         ž            ž
159         0x9f        376          0x0178       c5b8         Ÿ            Ÿ
160         0xa0        160          0x00a0       c2a0                       
161         0xa1        161          0x00a1       c2a1         ¡            ¡
162         0xa2        162          0x00a2       c2a2         ¢            ¢
163         0xa3        163          0x00a3       c2a3         £            £
164         0xa4        164          0x00a4       c2a4         ¤            ¤
165         0xa5        165          0x00a5       c2a5         ¥            ¥
166         0xa6        166          0x00a6       c2a6         ¦            ¦
167         0xa7        167          0x00a7       c2a7         §            §
168         0xa8        168          0x00a8       c2a8         ¨            ¨
169         0xa9        169          0x00a9       c2a9         ©            ©
170         0xaa        170          0x00aa       c2aa         ª            ª
171         0xab        171          0x00ab       c2ab         «            «
172         0xac        172          0x00ac       c2ac         ¬            ¬
173         0xad        173          0x00ad       c2ad         ­            ­
174         0xae        174          0x00ae       c2ae         ®            ®
175         0xaf        175          0x00af       c2af         ¯            ¯
176         0xb0        176          0x00b0       c2b0         °            °
177         0xb1        177          0x00b1       c2b1         ±            ±
178         0xb2        178          0x00b2       c2b2         ²            ²
179         0xb3        179          0x00b3       c2b3         ³            ³
180         0xb4        180          0x00b4       c2b4         ´            ´
181         0xb5        181          0x00b5       c2b5         µ            µ
182         0xb6        182          0x00b6       c2b6         ¶            ¶
183         0xb7        183          0x00b7       c2b7         ·            ·
184         0xb8        184          0x00b8       c2b8         ¸            ¸
185         0xb9        185          0x00b9       c2b9         ¹            ¹
186         0xba        186          0x00ba       c2ba         º            º
187         0xbb        187          0x00bb       c2bb         »            »
188         0xbc        188          0x00bc       c2bc         ¼            ¼
189         0xbd        189          0x00bd       c2bd         ½            ½
190         0xbe        190          0x00be       c2be         ¾            ¾
191         0xbf        191          0x00bf       c2bf         ¿            ¿
192         0xc0        192          0x00c0       c380         À            À
193         0xc1        193          0x00c1       c381         Á            Á
194         0xc2        194          0x00c2       c382         Â            Â
195         0xc3        195          0x00c3       c383         Ã            Ã
196         0xc4        196          0x00c4       c384         Ä            Ä
197         0xc5        197          0x00c5       c385         Å            Å
198         0xc6        198          0x00c6       c386         Æ            Æ
199         0xc7        199          0x00c7       c387         Ç            Ç
200         0xc8        200          0x00c8       c388         È            È
201         0xc9        201          0x00c9       c389         É            É
202         0xca        202          0x00ca       c38a         Ê            Ê
203         0xcb        203          0x00cb       c38b         Ë            Ë
204         0xcc        204          0x00cc       c38c         Ì            Ì
205         0xcd        205          0x00cd       c38d         Í            Í
206         0xce        206          0x00ce       c38e         Î            Î
207         0xcf        207          0x00cf       c38f         Ï            Ï
208         0xd0        208          0x00d0       c390         Ð            Ð
209         0xd1        209          0x00d1       c391         Ñ            Ñ
210         0xd2        210          0x00d2       c392         Ò            Ò
211         0xd3        211          0x00d3       c393         Ó            Ó
212         0xd4        212          0x00d4       c394         Ô            Ô
213         0xd5        213          0x00d5       c395         Õ            Õ
214         0xd6        214          0x00d6       c396         Ö            Ö
215         0xd7        215          0x00d7       c397         ×            ×
216         0xd8        216          0x00d8       c398         Ø            Ø
217         0xd9        217          0x00d9       c399         Ù            Ù
218         0xda        218          0x00da       c39a         Ú            Ú
219         0xdb        219          0x00db       c39b         Û            Û
220         0xdc        220          0x00dc       c39c         Ü            Ü
221         0xdd        221          0x00dd       c39d         Ý            Ý
222         0xde        222          0x00de       c39e         Þ            Þ
223         0xdf        223          0x00df       c39f         ß            ß
224         0xe0        224          0x00e0       c3a0         à            à
225         0xe1        225          0x00e1       c3a1         á            á
226         0xe2        226          0x00e2       c3a2         â            â
227         0xe3        227          0x00e3       c3a3         ã            ã
228         0xe4        228          0x00e4       c3a4         ä            ä
229         0xe5        229          0x00e5       c3a5         å            å
230         0xe6        230          0x00e6       c3a6         æ            æ
231         0xe7        231          0x00e7       c3a7         ç            ç
232         0xe8        232          0x00e8       c3a8         è            è
233         0xe9        233          0x00e9       c3a9         é            é
234         0xea        234          0x00ea       c3aa         ê            ê
235         0xeb        235          0x00eb       c3ab         ë            ë
236         0xec        236          0x00ec       c3ac         ì            ì
237         0xed        237          0x00ed       c3ad         í            í
238         0xee        238          0x00ee       c3ae         î            î
239         0xef        239          0x00ef       c3af         ï            ï
240         0xf0        240          0x00f0       c3b0         ð            ð
241         0xf1        241          0x00f1       c3b1         ñ            ñ
242         0xf2        242          0x00f2       c3b2         ò            ò
243         0xf3        243          0x00f3       c3b3         ó            ó
244         0xf4        244          0x00f4       c3b4         ô            ô
245         0xf5        245          0x00f5       c3b5         õ            õ
246         0xf6        246          0x00f6       c3b6         ö            ö
247         0xf7        247          0x00f7       c3b7         ÷            ÷
248         0xf8        248          0x00f8       c3b8         ø            ø
249         0xf9        249          0x00f9       c3b9         ù            ù
250         0xfa        250          0x00fa       c3ba         ú            ú
251         0xfb        251          0x00fb       c3bb         û            û
252         0xfc        252          0x00fc       c3bc         ü            ü
253         0xfd        253          0x00fd       c3bd         ý            ý
254         0xfe        254          0x00fe       c3be         þ            þ
255         0xff        255          0x00ff       c3bf         ÿ            ÿ
ISO 8859-1 (Latin-1) defines no printable characters in the range 0x80-0x9F (most decoders map those bytes to the C1 control codes U+0080-U+009F); outside this range it is identical to CP1252.
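To see the difference between the two encodings in practice, here is a minimal sketch that decodes a few sample bytes both ways with Perl's Encode module. The helper name `code_point` and the choice of sample bytes are illustrative assumptions, not part of the original script. Encode's `iso-8859-1` maps every byte to the code point of the same value, so 0x80-0x9F come out as C1 controls rather than printable characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a single byte under the given encoding
# and return the resulting Unicode code point as a number.
sub code_point {
    my ($encoding, $byte) = @_;
    return ord(decode($encoding, chr($byte)));
}

# Two bytes from the 0x80-0x9F range and two from above it.
for my $byte (0x80, 0x93, 0xA9, 0xE9) {
    printf "0x%02x  cp1252 -> U+%04X  iso-8859-1 -> U+%04X\n",
        $byte,
        code_point('cp1252',     $byte),
        code_point('iso-8859-1', $byte);
}
```

The first two lines of output should differ between the encodings (for example 0x80 is U+20AC under CP1252 but U+0080 under Latin-1), while 0xa9 and 0xe9 should give the same code point in both.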
Where the table shows 0xfffd (the Unicode replacement character) for a CP1252 byte, CP1252 doesn't actually define a character for that byte value. These are "holes" in the character set. The holes are: 0x81, 0x8d, 0x8f, 0x90, and 0x9d.
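The holes can also be found programmatically instead of being read off the table. The sketch below assumes Perl's Encode module with its default behaviour of substituting U+FFFD for unmappable bytes; the function name `cp1252_holes` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return every byte value in 0x80..0xFF that
# Encode's cp1252 decoder turns into U+FFFD, i.e. has no mapping for.
sub cp1252_holes {
    return grep { ord(decode('cp1252', chr($_))) == 0xFFFD } 0x80 .. 0xFF;
}

print join(', ', map { sprintf '0x%02x', $_ } cp1252_holes()), "\n";
# With the mapping shown in the table this prints:
# 0x81, 0x8d, 0x8f, 0x90, 0x9d
```

Comparing against U+FFFD is safe here because no CP1252 byte legitimately maps to that code point. If you would rather have undefined bytes raise an error than be replaced silently, `decode` accepts `Encode::FB_CROAK` as a third argument.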
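The table was generated with this Perl script:

#!/usr/local/bin/perl -w
use strict;
use Encode;

# Return the bytes of a string as lowercase hex digits.
# Not named "hex": a plain hex(...) call would resolve to Perl's built-in.
sub to_hex {
    return join '', map { sprintf("%02x", ord($_)) } split(//, $_[0]);
}

print "CP1252 Dec  CP1252 Hex  Unicode Dec  Unicode Hex  UTF-8 Bytes  CP1252 Char  Unicode Char\n";
print "----------  ----------  -----------  -----------  -----------  -----------  ------------\n";

for (my $i = 0x80; $i <= 0xFF; $i++) {
    my $ch     = chr($i);                           # raw CP1252 byte
    my $native = Encode::decode("cp1252", $ch);     # decoded Perl character
    my $utf8   = Encode::encode("utf-8", $native);  # UTF-8 encoded bytes

    # The two Char columns are written as HTML numeric character references.
    printf "%-12d0x%02x        %-13d0x%04x       %-13s&#%d;            &#%d;\n",
        $i, $i, ord($native), ord($native), to_hex($utf8), $i, ord($native);
}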